





Principles of 

Mathematical 

Analysis 

THIRD EDITION 


WALTER RUDIN 





McGRAW-HILL INTERNATIONAL EDITIONS

Mathematics Series 









McGRAW-HILL BOOK COMPANY 

Auckland Bogota Guatemala Hamburg Lisbon 

London Madrid Mexico New Delhi Panama Paris San Juan 

Sao Paulo Singapore Sydney Tokyo 


WALTER RUDIN 

Professor of Mathematics 
University of Wisconsin — Madison 





PRINCIPLES OF MATHEMATICAL ANALYSIS, Third Edition 
International Editions 1976 

Exclusive rights by McGraw-Hill Book Co. - Singapore for manufacture and 
export. This book cannot be re-exported from the country to which it is consigned 
by McGraw-Hill. 

Copyright © 1964, 1976 by McGraw-Hill, Inc. All rights reserved. 

Copyright © 1953 by McGraw-Hill, Inc. All rights reserved. 

Except as permitted under the United States Copyright Act of 1976, no part 
of this publication may be reproduced or distributed in any form or by any 
means, or stored in a data base or retrieval system, without the prior written 
permission of the publisher. 




Library of Congress Cataloging-in-Publication Data 

Rudin, Walter, date 

Principles of mathematical analysis. 

(International series in pure and applied mathematics) 
Bibliography: p. 

Includes index. 

1. Mathematical analysis. I. Title.

QA300.R8 1976 515 75-17903 

ISBN 0-07-054234-X 


When ordering this title, use ISBN 0-07-085613-3 


Printed in Singapore 



CONTENTS 


Preface

Chapter 1 The Real and Complex Number Systems
    Introduction
    Ordered Sets
    Fields
    The Real Field
    The Extended Real Number System
    The Complex Field
    Euclidean Spaces
    Appendix
    Exercises

Chapter 2 Basic Topology
    Finite, Countable, and Uncountable Sets
    Metric Spaces
    Compact Sets
    Perfect Sets
    Connected Sets
    Exercises

Chapter 3 Numerical Sequences and Series
    Convergent Sequences
    Subsequences
    Cauchy Sequences
    Upper and Lower Limits
    Some Special Sequences
    Series
    Series of Nonnegative Terms
    The Number e
    The Root and Ratio Tests
    Power Series
    Summation by Parts
    Absolute Convergence
    Addition and Multiplication of Series
    Rearrangements
    Exercises

Chapter 4 Continuity
    Limits of Functions
    Continuous Functions
    Continuity and Compactness
    Continuity and Connectedness
    Discontinuities
    Monotonic Functions
    Infinite Limits and Limits at Infinity
    Exercises

Chapter 5 Differentiation
    The Derivative of a Real Function
    Mean Value Theorems
    The Continuity of Derivatives
    L’Hospital’s Rule
    Derivatives of Higher Order
    Taylor’s Theorem
    Differentiation of Vector-valued Functions
    Exercises

Chapter 6 The Riemann-Stieltjes Integral
    Definition and Existence of the Integral
    Properties of the Integral
    Integration and Differentiation
    Integration of Vector-valued Functions
    Rectifiable Curves
    Exercises

Chapter 7 Sequences and Series of Functions
    Discussion of Main Problem
    Uniform Convergence
    Uniform Convergence and Continuity
    Uniform Convergence and Integration
    Uniform Convergence and Differentiation
    Equicontinuous Families of Functions
    The Stone-Weierstrass Theorem
    Exercises

Chapter 8 Some Special Functions
    Power Series
    The Exponential and Logarithmic Functions
    The Trigonometric Functions
    The Algebraic Completeness of the Complex Field
    Fourier Series
    The Gamma Function
    Exercises

Chapter 9 Functions of Several Variables
    Linear Transformations
    Differentiation
    The Contraction Principle
    The Inverse Function Theorem
    The Implicit Function Theorem
    The Rank Theorem
    Determinants
    Derivatives of Higher Order
    Differentiation of Integrals
    Exercises

Chapter 10 Integration of Differential Forms
    Integration
    Primitive Mappings
    Partitions of Unity
    Change of Variables
    Differential Forms
    Simplexes and Chains
    Stokes’ Theorem
    Closed Forms and Exact Forms
    Vector Analysis
    Exercises

Chapter 11 The Lebesgue Theory
    Set Functions
    Construction of the Lebesgue Measure
    Measure Spaces
    Measurable Functions
    Simple Functions
    Integration
    Comparison with the Riemann Integral
    Integration of Complex Functions
    Functions of Class $\mathcal{L}^2$
    Exercises

Bibliography

List of Special Symbols

Index



PREFACE 


This book is intended to serve as a text for the course in analysis that is usually 
taken by advanced undergraduates or by first-year students who study mathe- 
matics. 

The present edition covers essentially the same topics as the second one, 
with some additions, a few minor omissions, and considerable rearrangement. I 
hope that these changes will make the material more accessible and more
attractive to the students who take such a course.

Experience has convinced me that it is pedagogically unsound (though 
logically correct) to start off with the construction of the real numbers from the 
rational ones. At the beginning, most students simply fail to appreciate the need 
for doing this. Accordingly, the real number system is introduced as an ordered 
field with the least-upper-bound property, and a few interesting applications of 
this property are quickly made. However, Dedekind’s construction is not omit- 
ted. It is now in an Appendix to Chapter 1, where it may be studied and enjoyed 
whenever the time seems ripe. 

The material on functions of several variables is almost completely
rewritten, with many details filled in, and with more examples and more
motivation. The proof of the inverse function theorem (the key item in Chapter 9)
is simplified by means of the fixed point theorem about contraction mappings.
Differential forms are discussed in much greater detail. Several applications of 
Stokes’ theorem are included. 

As regards other changes, the chapter on the Riemann-Stieltjes integral 
has been trimmed a bit, a short do-it-yourself section on the gamma function 
has been added to Chapter 8, and there is a large number of new exercises, most 
of them with fairly detailed hints. 

I have also included several references to articles appearing in the American 
Mathematical Monthly and in Mathematics Magazine , in the hope that students 
will develop the habit of looking into the journal literature. Most of these 
references were kindly supplied by R. B. Burckel. 

Over the years, many people, students as well as teachers, have sent me 
corrections, criticisms, and other comments concerning the previous editions 
of this book. I have appreciated these, and I take this opportunity to express
my sincere thanks to all who have written me. 


WALTER RUDIN 



1 

THE REAL AND COMPLEX NUMBER SYSTEMS 


INTRODUCTION 

A satisfactory discussion of the main concepts of analysis (such as convergence, 
continuity, differentiation, and integration) must be based on an accurately 
defined number concept. We shall not, however, enter into any discussion of 
the axioms that govern the arithmetic of the integers, but assume familiarity 
with the rational numbers (i.e., the numbers of the form $m/n$, where $m$ and $n$
are integers and $n \neq 0$).

The rational number system is inadequate for many purposes, both as a
field and as an ordered set. (These terms will be defined in Secs. 1.6 and 1.12.)
For instance, there is no rational $p$ such that $p^2 = 2$. (We shall prove this
presently.) This leads to the introduction of so-called “irrational numbers”
which are often written as infinite decimal expansions and are considered to be
“approximated” by the corresponding finite decimals. Thus the sequence

1, 1.4, 1.41, 1.414, 1.4142, . . .

“tends to $\sqrt{2}$.” But unless the irrational number $\sqrt{2}$ has been clearly
defined, the question must arise: Just what is it that this sequence “tends to”?

This sort of question can be answered as soon as the so-called “real
number system” is constructed.


1.1 Example We now show that the equation

(1)    $p^2 = 2$

is not satisfied by any rational $p$. If there were such a $p$, we could write $p = m/n$
where $m$ and $n$ are integers that are not both even. Let us assume this is done.
Then (1) implies

(2)    $m^2 = 2n^2.$

This shows that $m^2$ is even. Hence $m$ is even (if $m$ were odd, $m^2$ would be odd),
and so $m^2$ is divisible by 4. It follows that the right side of (2) is divisible by 4,
so that $n^2$ is even, which implies that $n$ is even.

The assumption that (1) holds thus leads to the conclusion that both $m$
and $n$ are even, contrary to our choice of $m$ and $n$. Hence (1) is impossible for
rational $p$.

We now examine this situation a little more closely. Let $A$ be the set of
all positive rationals $p$ such that $p^2 < 2$ and let $B$ consist of all positive rationals
$p$ such that $p^2 > 2$. We shall show that $A$ contains no largest number and $B$
contains no smallest.

More explicitly, for every $p$ in $A$ we can find a rational $q$ in $A$ such that
$p < q$, and for every $p$ in $B$ we can find a rational $q$ in $B$ such that $q < p$.

To do this, we associate with each rational $p > 0$ the number

(3)    $q = p - \frac{p^2 - 2}{p + 2} = \frac{2p + 2}{p + 2}.$

Then

(4)    $q^2 - 2 = \frac{2(p^2 - 2)}{(p + 2)^2}.$

If $p$ is in $A$ then $p^2 - 2 < 0$, (3) shows that $q > p$, and (4) shows that
$q^2 < 2$. Thus $q$ is in $A$.

If $p$ is in $B$ then $p^2 - 2 > 0$, (3) shows that $0 < q < p$, and (4) shows that
$q^2 > 2$. Thus $q$ is in $B$.
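
The construction in Example 1.1 can be tried out with exact rational arithmetic. The sketch below is only an illustration: it assumes Python's `fractions.Fraction` as a model of the rationals, and the names `step`, `in_A`, and `in_B` are hypothetical helpers introduced here. It applies formula (3) repeatedly and checks identity (4) at each step.

```python
from fractions import Fraction


def step(p: Fraction) -> Fraction:
    """Formula (3): q = p - (p^2 - 2)/(p + 2) = (2p + 2)/(p + 2)."""
    return (2 * p + 2) / (p + 2)


def in_A(p: Fraction) -> bool:
    """Membership in A: positive rationals with p^2 < 2."""
    return p > 0 and p * p < 2


def in_B(p: Fraction) -> bool:
    """Membership in B: positive rationals with p^2 > 2."""
    return p > 0 and p * p > 2


# Starting in A, every step stays in A and strictly increases.
p = Fraction(1)
for _ in range(5):
    q = step(p)
    assert in_A(q) and q > p
    # Identity (4): q^2 - 2 = 2(p^2 - 2)/(p + 2)^2
    assert q * q - 2 == 2 * (p * p - 2) / (p + 2) ** 2
    p = q

# Starting in B, every step stays in B and strictly decreases.
p = Fraction(2)
for _ in range(5):
    q = step(p)
    assert in_B(q) and q < p
    p = q
```

No finite number of steps reaches a largest element of A or a smallest element of B, which is exactly what the example asserts.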


1.2 Remark The purpose of the above discussion has been to show that the 
rational number system has certain gaps, in spite of the fact that between any 
two rationals there is another: If r < s then r < (r + s)/2 < s. The real number 
system fills these gaps. This is the principal reason for the fundamental role 
which it plays in analysis.

In order to elucidate its structure, as well as that of the complex numbers,
we start with a brief discussion of the general concepts of ordered set and field.

Here is some of the standard set-theoretic terminology that will be used 
throughout this book. 

1.3 Definitions If $A$ is any set (whose elements may be numbers or any other
objects), we write $x \in A$ to indicate that $x$ is a member (or an element) of $A$.

If $x$ is not a member of $A$, we write: $x \notin A$.

The set which contains no element will be called the empty set. If a set has
at least one element, it is called nonempty.

If $A$ and $B$ are sets, and if every element of $A$ is an element of $B$, we say
that $A$ is a subset of $B$, and write $A \subset B$, or $B \supset A$. If, in addition, there is an
element of $B$ which is not in $A$, then $A$ is said to be a proper subset of $B$. Note
that $A \subset A$ for every set $A$.

If $A \subset B$ and $B \subset A$, we write $A = B$. Otherwise $A \neq B$.

1.4 Definition Throughout Chap. 1, the set of all rational numbers will be 
denoted by Q. 

ORDERED SETS 

1.5 Definition Let $S$ be a set. An order on $S$ is a relation, denoted by $<$, with
the following two properties:

(i) If $x \in S$ and $y \in S$ then one and only one of the statements

$x < y$,    $x = y$,    $y < x$

is true.

(ii) If $x, y, z \in S$, if $x < y$ and $y < z$, then $x < z$.

The statement “$x < y$” may be read as “$x$ is less than $y$” or “$x$ is smaller
than $y$” or “$x$ precedes $y$.”

It is often convenient to write $y > x$ in place of $x < y$.

The notation $x \le y$ indicates that $x < y$ or $x = y$, without specifying which
of these two is to hold. In other words, $x \le y$ is the negation of $x > y$.

1.6 Definition An ordered set is a set S in which an order is defined. 

For example, $Q$ is an ordered set if $r < s$ is defined to mean that $s - r$ is a
positive rational number.

1.7 Definition Suppose $S$ is an ordered set, and $E \subset S$. If there exists a
$\beta \in S$ such that $x \le \beta$ for every $x \in E$, we say that $E$ is bounded above, and call
$\beta$ an upper bound of $E$.

Lower bounds are defined in the same way (with $\ge$ in place of $\le$).

1.8 Definition Suppose $S$ is an ordered set, $E \subset S$, and $E$ is bounded above.
Suppose there exists an $\alpha \in S$ with the following properties:

(i) $\alpha$ is an upper bound of $E$.

(ii) If $\gamma < \alpha$ then $\gamma$ is not an upper bound of $E$.

Then $\alpha$ is called the least upper bound of $E$ [that there is at most one such
$\alpha$ is clear from (ii)] or the supremum of $E$, and we write

$\alpha = \sup E.$

The greatest lower bound, or infimum, of a set $E$ which is bounded below
is defined in the same manner: The statement

$\alpha = \inf E$

means that $\alpha$ is a lower bound of $E$ and that no $\beta$ with $\beta > \alpha$ is a lower bound
of $E$.


1.9 Examples

(a) Consider the sets $A$ and $B$ of Example 1.1 as subsets of the ordered
set $Q$. The set $A$ is bounded above. In fact, the upper bounds of $A$ are
exactly the members of $B$. Since $B$ contains no smallest member, $A$ has
no least upper bound in $Q$.

Similarly, $B$ is bounded below: The set of all lower bounds of $B$
consists of $A$ and of all $r \in Q$ with $r \le 0$. Since $A$ has no largest member,
$B$ has no greatest lower bound in $Q$.

(b) If $\alpha = \sup E$ exists, then $\alpha$ may or may not be a member of $E$. For
instance, let $E_1$ be the set of all $r \in Q$ with $r < 0$. Let $E_2$ be the set of all
$r \in Q$ with $r \le 0$. Then

$\sup E_1 = \sup E_2 = 0,$

and $0 \notin E_1$, $0 \in E_2$.

(c) Let $E$ consist of all numbers $1/n$, where $n = 1, 2, 3, \ldots$. Then
$\sup E = 1$, which is in $E$, and $\inf E = 0$, which is not in $E$.

1.10 Definition An ordered set S is said to have the least-upper-bound property 
if the following is true: 

If $E \subset S$, $E$ is not empty, and $E$ is bounded above, then $\sup E$ exists in $S$.

Example 1.9(a) shows that $Q$ does not have the least-upper-bound property.

We shall now show that there is a close relation between greatest lower
bounds and least upper bounds, and that every ordered set with the
least-upper-bound property also has the greatest-lower-bound property.

1.11 Theorem Suppose $S$ is an ordered set with the least-upper-bound property,
$B \subset S$, $B$ is not empty, and $B$ is bounded below. Let $L$ be the set of all lower
bounds of $B$. Then

$\alpha = \sup L$

exists in $S$, and $\alpha = \inf B$.

In particular, $\inf B$ exists in $S$.

Proof Since $B$ is bounded below, $L$ is not empty. Since $L$ consists of
exactly those $y \in S$ which satisfy the inequality $y \le x$ for every $x \in B$, we
see that every $x \in B$ is an upper bound of $L$. Thus $L$ is bounded above.
Our hypothesis about $S$ implies therefore that $L$ has a supremum in $S$;
call it $\alpha$.

If $\gamma < \alpha$ then (see Definition 1.8) $\gamma$ is not an upper bound of $L$,
hence $\gamma \notin B$. It follows that $\alpha \le x$ for every $x \in B$. Thus $\alpha \in L$.

If $\alpha < \beta$ then $\beta \notin L$, since $\alpha$ is an upper bound of $L$.

We have shown that $\alpha \in L$ but $\beta \notin L$ if $\beta > \alpha$. In other words, $\alpha$
is a lower bound of $B$, but $\beta$ is not if $\beta > \alpha$. This means that $\alpha = \inf B$.
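
The statement of Theorem 1.11 can be checked by brute force on a small finite ordered set, where every supremum and infimum is found by enumeration. The sketch below is an illustration only; the set `S`, the subset `B`, and the helper names are hypothetical choices made for this example.

```python
def upper_bounds(S, E):
    """All beta in S with x <= beta for every x in E (Definition 1.7)."""
    return [beta for beta in S if all(x <= beta for x in E)]


def lower_bounds(S, E):
    """All y in S with y <= x for every x in E."""
    return [y for y in S if all(y <= x for x in E)]


def sup(S, E):
    """Least upper bound of E in the finite ordered set S, or None."""
    ubs = upper_bounds(S, E)
    return min(ubs) if ubs else None


def inf(S, E):
    """Greatest lower bound of E in the finite ordered set S, or None."""
    lbs = lower_bounds(S, E)
    return max(lbs) if lbs else None


S = range(10)            # the ordered set {0, 1, ..., 9}
B = [4, 6, 7]            # nonempty and bounded below
L = lower_bounds(S, B)   # the set L of the theorem
alpha = sup(S, L)

# Theorem 1.11: sup L exists and equals inf B.
assert alpha == inf(S, B)
```

A finite ordered set trivially has the least-upper-bound property, so this checks only the bookkeeping of the proof, not its substance.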


FIELDS 

1.12 Definition A field is a set $F$ with two operations, called addition and
multiplication, which satisfy the following so-called “field axioms” (A), (M),
and (D):

(A) Axioms for addition

(A1) If $x \in F$ and $y \in F$, then their sum $x + y$ is in $F$.

(A2) Addition is commutative: $x + y = y + x$ for all $x, y \in F$.

(A3) Addition is associative: $(x + y) + z = x + (y + z)$ for all $x, y, z \in F$.

(A4) $F$ contains an element 0 such that $0 + x = x$ for every $x \in F$.

(A5) To every $x \in F$ corresponds an element $-x \in F$ such that

$x + (-x) = 0.$

(M) Axioms for multiplication

(M1) If $x \in F$ and $y \in F$, then their product $xy$ is in $F$.

(M2) Multiplication is commutative: $xy = yx$ for all $x, y \in F$.

(M3) Multiplication is associative: $(xy)z = x(yz)$ for all $x, y, z \in F$.

(M4) $F$ contains an element $1 \neq 0$ such that $1x = x$ for every $x \in F$.

(M5) If $x \in F$ and $x \neq 0$ then there exists an element $1/x \in F$ such that

$x \cdot (1/x) = 1.$

(D) The distributive law

$x(y + z) = xy + xz$

holds for all $x, y, z \in F$.

1.13 Remarks 

(a) One usually writes (in any field)

$x - y$, $\frac{x}{y}$, $x + y + z$, $xyz$, $x^2$, $x^3$, $2x$, $3x$, . . .

in place of

$x + (-y)$, $x \cdot \left(\frac{1}{y}\right)$, $(x + y) + z$, $(xy)z$, $xx$, $xxx$, $x + x$, $x + x + x$, . . .

(b) The field axioms clearly hold in Q , the set of all rational numbers, if 
addition and multiplication have their customary meaning. Thus Q is a 
field. 

(c) Although it is not our purpose to study fields (or any other algebraic 
structures) in detail, it is worthwhile to prove that some familiar properties 
of Q are consequences of the field axioms; once we do this, we will not 
need to do it again for the real numbers and for the complex numbers. 
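
Remark 1.13(b) says that the field axioms hold in Q. As a sanity check, and not a proof, the sketch below tests axioms (A), (M), and (D) on a small finite sample of rationals. It assumes Python's `fractions.Fraction` as a model of Q; the sample and the function name `check_field_axioms` are hypothetical choices for this illustration.

```python
from fractions import Fraction
from itertools import product

# A small sample of rationals (duplicates such as 0/1 and 0/2 are harmless).
sample = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]


def check_field_axioms(elems) -> bool:
    """Spot-check the axioms of Definition 1.12 on the given rationals."""
    zero, one = Fraction(0), Fraction(1)
    for x, y, z in product(elems, repeat=3):
        assert x + y == y + x                    # (A2)
        assert (x + y) + z == x + (y + z)        # (A3)
        assert x * y == y * x                    # (M2)
        assert (x * y) * z == x * (y * z)        # (M3)
        assert x * (y + z) == x * y + x * z      # (D)
    for x in elems:
        assert zero + x == x                     # (A4)
        assert x + (-x) == zero                  # (A5)
        assert one * x == x                      # (M4)
        if x != zero:
            assert x * (1 / x) == one            # (M5)
    return True


check_field_axioms(sample)
```

Closure, (A1) and (M1), is implicit here because `Fraction` arithmetic always returns a `Fraction`.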

1.14 Proposition The axioms for addition imply the following statements. 

(a) If $x + y = x + z$ then $y = z$.

(b) If $x + y = x$ then $y = 0$.

(c) If $x + y = 0$ then $y = -x$.

(d) $-(-x) = x$.

Statement (a) is a cancellation law. Note that (b) asserts the uniqueness
of the element whose existence is assumed in (A4), and that (c) does the same
for (A5).

Proof If $x + y = x + z$, the axioms (A) give

$y = 0 + y = (-x + x) + y = -x + (x + y)$
$\quad = -x + (x + z) = (-x + x) + z = 0 + z = z.$

This proves (a). Take $z = 0$ in (a) to obtain (b). Take $z = -x$ in (a) to
obtain (c).

Since $-x + x = 0$, (c) (with $-x$ in place of $x$) gives (d).

1.15 Proposition The axioms for multiplication imply the following statements.

(a) If $x \neq 0$ and $xy = xz$ then $y = z$.

(b) If $x \neq 0$ and $xy = x$ then $y = 1$.

(c) If $x \neq 0$ and $xy = 1$ then $y = 1/x$.

(d) If $x \neq 0$ then $1/(1/x) = x$.

The proof is so similar to that of Proposition 1.14 that we omit it. 

1.16 Proposition The field axioms imply the following statements, for any $x$, $y$,
$z \in F$.

(a) $0x = 0$.

(b) If $x \neq 0$ and $y \neq 0$ then $xy \neq 0$.

(c) $(-x)y = -(xy) = x(-y)$.

(d) $(-x)(-y) = xy$.

Proof $0x + 0x = (0 + 0)x = 0x$. Hence 1.14(b) implies that $0x = 0$, and
(a) holds.

Next, assume $x \neq 0$, $y \neq 0$, but $xy = 0$. Then (a) gives

$1 = \left(\frac{1}{y}\right)\left(\frac{1}{x}\right)xy = \left(\frac{1}{y}\right)\left(\frac{1}{x}\right)0 = 0,$

a contradiction. Thus (b) holds.

The first equality in (c) comes from

$(-x)y + xy = (-x + x)y = 0y = 0,$

combined with 1.14(c); the other half of (c) is proved in the same way.
Finally,

$(-x)(-y) = -[x(-y)] = -[-(xy)] = xy$

by (c) and 1.14(d).

1.17 Definition An ordered field is a field $F$ which is also an ordered set, such
that

(i) $x + y < x + z$ if $x, y, z \in F$ and $y < z$,

(ii) $xy > 0$ if $x \in F$, $y \in F$, $x > 0$, and $y > 0$.

If $x > 0$, we call $x$ positive; if $x < 0$, $x$ is negative.

For example, Q is an ordered field. 

All the familiar rules for working with inequalities apply in every ordered
field: Multiplication by positive [negative] quantities preserves [reverses]
inequalities, no square is negative, etc. The following proposition lists some of
these.

1.18 Proposition The following statements are true in every ordered field.

(a) If $x > 0$ then $-x < 0$, and vice versa.

(b) If $x > 0$ and $y < z$ then $xy < xz$.

(c) If $x < 0$ and $y < z$ then $xy > xz$.

(d) If $x \neq 0$ then $x^2 > 0$. In particular, $1 > 0$.

(e) If $0 < x < y$ then $0 < 1/y < 1/x$.

Proof

(a) If $x > 0$ then $0 = -x + x > -x + 0$, so that $-x < 0$. If $x < 0$ then
$0 = -x + x < -x + 0$, so that $-x > 0$. This proves (a).

(b) Since $z > y$, we have $z - y > y - y = 0$, hence $x(z - y) > 0$, and
therefore

$xz = x(z - y) + xy > 0 + xy = xy.$

(c) By (a), (b), and Proposition 1.16(c),

$-[x(z - y)] = (-x)(z - y) > 0,$

so that $x(z - y) < 0$, hence $xz < xy$.

(d) If $x > 0$, part (ii) of Definition 1.17 gives $x^2 > 0$. If $x < 0$, then
$-x > 0$, hence $(-x)^2 > 0$. But $x^2 = (-x)^2$, by Proposition 1.16(d).
Since $1 = 1^2$, $1 > 0$.

(e) If $y > 0$ and $v \le 0$, then $yv \le 0$. But $y \cdot (1/y) = 1 > 0$. Hence $1/y > 0$.
Likewise, $1/x > 0$. If we multiply both sides of the inequality $x < y$ by
the positive quantity $(1/x)(1/y)$, we obtain $1/y < 1/x$.


THE REAL FIELD 

We now state the existence theorem which is the core of this chapter. 

1.19 Theorem There exists an ordered field R which has the least-upper-bound 
property. 

Moreover , R contains Q as a subfield. 

The second statement means that $Q \subset R$ and that the operations of
addition and multiplication in $R$, when applied to members of $Q$, coincide with
the usual operations on rational numbers; also, the positive rational numbers
are positive elements of $R$.

The members of R are called real numbers. 

The proof of Theorem 1.19 is rather long and a bit tedious and is therefore
presented in an Appendix to Chap. 1. The proof actually constructs R from Q.

The next theorem could be extracted from this construction with very
little extra effort. However, we prefer to derive it from Theorem 1.19 since this
provides a good illustration of what one can do with the least-upper-bound
property.

1.20 Theorem 

(a) If $x \in R$, $y \in R$, and $x > 0$, then there is a positive integer $n$ such that

$nx > y.$

(b) If $x \in R$, $y \in R$, and $x < y$, then there exists a $p \in Q$ such that $x < p < y$.

Part (a) is usually referred to as the archimedean property of $R$. Part (b)
may be stated by saying that $Q$ is dense in $R$: Between any two real numbers
there is a rational one.

Proof

(a) Let $A$ be the set of all $nx$, where $n$ runs through the positive integers.
If (a) were false, then $y$ would be an upper bound of $A$. But then $A$ has a
least upper bound in $R$. Put $\alpha = \sup A$. Since $x > 0$, $\alpha - x < \alpha$, and
$\alpha - x$ is not an upper bound of $A$. Hence $\alpha - x < mx$ for some positive
integer $m$. But then $\alpha < (m + 1)x \in A$, which is impossible, since $\alpha$ is an
upper bound of $A$.

(b) Since $x < y$, we have $y - x > 0$, and (a) furnishes a positive integer
$n$ such that

$n(y - x) > 1.$

Apply (a) again, to obtain positive integers $m_1$ and $m_2$ such that $m_1 > nx$,
$m_2 > -nx$. Then

$-m_2 < nx < m_1.$

Hence there is an integer $m$ (with $-m_2 \le m \le m_1$) such that

$m - 1 \le nx < m.$

If we combine these inequalities, we obtain

$nx < m \le 1 + nx < ny.$

Since $n > 0$, it follows that

$x < \frac{m}{n} < y.$

This proves (b), with $p = m/n$.
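
The proof of part (b) is constructive once $n$ and $m$ are known, and it can be followed step by step in code. The sketch below is an illustration under a stated assumption: the inputs $x$ and $y$ are themselves given as exact `Fraction` values (a stand-in for real numbers, which a program cannot represent in general). The function name `rational_between` is hypothetical.

```python
from fractions import Fraction
import math


def rational_between(x: Fraction, y: Fraction) -> Fraction:
    """Return p = m/n with x < p < y, following the proof of Theorem 1.20(b)."""
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean property: choose a positive integer n with n(y - x) > 1.
    n = math.floor(1 / (y - x)) + 1
    # Choose the integer m with m - 1 <= nx < m.
    m = math.floor(n * x) + 1
    # Then nx < m <= 1 + nx < ny, so x < m/n < y.
    return Fraction(m, n)


x, y = Fraction(141421, 100000), Fraction(141422, 100000)
p = rational_between(x, y)
assert x < p < y
```

The choice of $n$ here is the smallest one that works with this floor-based recipe; any larger $n$ would serve equally well in the proof.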

We shall now prove the existence of $n$th roots of positive reals. This
proof will show how the difficulty pointed out in the Introduction (irrationality
of $\sqrt{2}$) can be handled in $R$.

1.21 Theorem For every real $x > 0$ and every integer $n > 0$ there is one
and only one positive real $y$ such that $y^n = x$.

This number $y$ is written $\sqrt[n]{x}$ or $x^{1/n}$.

Proof That there is at most one such $y$ is clear, since $0 < y_1 < y_2$ implies
$y_1^n < y_2^n$.

Let $E$ be the set consisting of all positive real numbers $t$ such that
$t^n < x$.

If $t = x/(1 + x)$ then $0 < t < 1$. Hence $t^n \le t < x$. Thus $t \in E$, and
$E$ is not empty.

If $t > 1 + x$ then $t^n \ge t > x$, so that $t \notin E$. Thus $1 + x$ is an upper
bound of $E$.

Hence Theorem 1.19 implies the existence of

$y = \sup E.$

To prove that $y^n = x$ we will show that each of the inequalities $y^n < x$
and $y^n > x$ leads to a contradiction.

The identity $b^n - a^n = (b - a)(b^{n-1} + b^{n-2}a + \cdots + a^{n-1})$ yields
the inequality

$b^n - a^n < (b - a)nb^{n-1}$

when $0 < a < b$.

Assume $y^n < x$. Choose $h$ so that $0 < h < 1$ and

$h < \frac{x - y^n}{n(y + 1)^{n-1}}.$

Put $a = y$, $b = y + h$. Then

$(y + h)^n - y^n < hn(y + h)^{n-1} < hn(y + 1)^{n-1} < x - y^n.$

Thus $(y + h)^n < x$, and $y + h \in E$. Since $y + h > y$, this contradicts the
fact that $y$ is an upper bound of $E$.

Assume $y^n > x$. Put

$k = \frac{y^n - x}{ny^{n-1}}.$

Then $0 < k < y$. If $t \ge y - k$, we conclude that

$y^n - t^n \le y^n - (y - k)^n < kny^{n-1} = y^n - x.$

Thus $t^n > x$, and $t \notin E$. It follows that $y - k$ is an upper bound of $E$.
But $y - k < y$, which contradicts the fact that $y$ is the least upper bound
of $E$.

Hence $y^n = x$, and the proof is complete.
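
The proof obtains $y$ as $\sup E$ without computing it, but the two facts it establishes at the start (that $x/(1+x)$ lies in $E$ and that $1 + x$ is an upper bound of $E$) are enough to approximate $y$ by bisection. The sketch below is an illustration, not the method of the proof: bisection is a technique swapped in here, the input is assumed to be an exact `Fraction`, and the name `nth_root_bounds` is hypothetical.

```python
from fractions import Fraction


def nth_root_bounds(x: Fraction, n: int, steps: int = 40):
    """Return rationals (lo, hi) with lo**n < x <= hi**n.

    Throughout, lo is a member of E = {t > 0 : t**n < x} and hi is an
    upper bound of E, so sup E always lies in the interval [lo, hi].
    """
    if x <= 0 or n <= 0:
        raise ValueError("need x > 0 and n > 0")
    lo = x / (1 + x)   # in E, as shown in the proof
    hi = 1 + x         # an upper bound of E, as shown in the proof
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid   # mid is in E
        else:
            hi = mid   # mid**n >= x, so mid is an upper bound of E
    return lo, hi


lo, hi = nth_root_bounds(Fraction(2), 2)
assert lo ** 2 < 2 <= hi ** 2
```

Each step halves the interval, so after 40 steps the bounds for $\sqrt{2}$ differ by less than $10^{-9}$, yet neither endpoint is ever exactly $\sqrt{2}$, in keeping with Example 1.1.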

Corollary If $a$ and $b$ are positive real numbers and $n$ is a positive integer, then

$(ab)^{1/n} = a^{1/n}b^{1/n}.$

Proof Put $\alpha = a^{1/n}$, $\beta = b^{1/n}$. Then

$ab = \alpha^n\beta^n = (\alpha\beta)^n,$

since multiplication is commutative. [Axiom (M2) in Definition 1.12.]
The uniqueness assertion of Theorem 1.21 shows therefore that

$(ab)^{1/n} = \alpha\beta = a^{1/n}b^{1/n}.$


1.22 Decimals We conclude this section by pointing out the relation between 
real numbers and decimals. 

Let $x > 0$ be real. Let $n_0$ be the largest integer such that $n_0 \le x$. (Note that
the existence of $n_0$ depends on the archimedean property of $R$.) Having chosen
$n_0, n_1, \ldots, n_{k-1}$, let $n_k$ be the largest integer such that

$n_0 + \frac{n_1}{10} + \cdots + \frac{n_k}{10^k} \le x.$

Let $E$ be the set of these numbers

(5)    $n_0 + \frac{n_1}{10} + \cdots + \frac{n_k}{10^k}$    $(k = 0, 1, 2, \ldots).$

Then $x = \sup E$. The decimal expansion of $x$ is

(6)    $n_0 . n_1 n_2 n_3 \cdots.$

Conversely, for any infinite decimal (6) the set E of numbers (5) is bounded 
above, and (6) is the decimal expansion of sup E. 

Since we shall never use decimals, we do not enter into a detailed 
discussion. 
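
The recursive choice of the digits $n_0, n_1, \ldots$ in 1.22 translates directly into a short loop. The sketch below is an illustration that assumes $x$ is supplied as an exact `Fraction`; the name `decimal_digits` is hypothetical.

```python
from fractions import Fraction
import math


def decimal_digits(x: Fraction, k: int):
    """Return ([n_0, ..., n_k], partial sum), following 1.22.

    At stage j, n_j is the largest integer for which
    n_0 + n_1/10 + ... + n_j/10**j <= x.
    """
    if x <= 0:
        raise ValueError("need x > 0")
    digits = []
    partial = Fraction(0)
    for j in range(k + 1):
        scale = 10 ** j
        # Largest integer n with partial + n/scale <= x.
        n = math.floor((x - partial) * scale)
        digits.append(n)
        partial += Fraction(n, scale)
    return digits, partial


digits, partial = decimal_digits(Fraction(22, 7), 3)
# The partial sums are the members of the set E in (5); x is their supremum.
assert partial <= Fraction(22, 7)
```

For $x = 22/7$ the first four stages give the digits 3, 1, 4, 2, that is, the truncation 3.142 of the decimal expansion.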


THE EXTENDED REAL NUMBER SYSTEM 

1.23 Definition The extended real number system consists of the real field $R$
and two symbols, $+\infty$ and $-\infty$. We preserve the original order in $R$, and
define

$-\infty < x < +\infty$

for every $x \in R$.

It is then clear that $+\infty$ is an upper bound of every subset of the extended
real number system, and that every nonempty subset has a least upper bound.
If, for example, $E$ is a nonempty set of real numbers which is not bounded
above in $R$, then $\sup E = +\infty$ in the extended real number system.

Exactly the same remarks apply to lower bounds.

The extended real number system does not form a field, but it is customary
to make the following conventions:

(a) If $x$ is real then

$x + \infty = +\infty$,    $x - \infty = -\infty$,    $\frac{x}{+\infty} = \frac{x}{-\infty} = 0.$

(b) If $x > 0$ then $x \cdot (+\infty) = +\infty$, $x \cdot (-\infty) = -\infty$.

(c) If $x < 0$ then $x \cdot (+\infty) = -\infty$, $x \cdot (-\infty) = +\infty$.

When it is desired to make the distinction between real numbers on the
one hand and the symbols $+\infty$ and $-\infty$ on the other quite explicit, the former
are called finite.
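
Floating-point arithmetic happens to follow the same conventions, which gives a quick way to experiment with them. The sketch below is an illustration only: it assumes Python floats (IEEE 754 doubles) as a rough model, with `math.inf` playing the role of $+\infty$, and the name `conventions_hold` is hypothetical.

```python
import math

INF = math.inf


def conventions_hold(x: float) -> bool:
    """Check conventions (a)-(c) of 1.23 for a finite float x."""
    ok = (x + INF == INF) and (x - INF == -INF)          # (a), sums
    ok = ok and (x / INF == 0) and (x / -INF == 0)       # (a), quotients
    if x > 0:
        ok = ok and (x * INF == INF) and (x * -INF == -INF)   # (b)
    if x < 0:
        ok = ok and (x * INF == -INF) and (x * -INF == INF)   # (c)
    return ok


assert all(conventions_hold(x) for x in (-2.5, 0.0, 3.0))

# No convention is made for expressions such as (+inf) - (+inf) or 0 * (+inf);
# floats signal this by returning "not a number".
assert math.isnan(INF - INF)
assert math.isnan(0.0 * INF)
```

The undefined cases are one concrete way to see that the extended real number system does not form a field.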


THE COMPLEX FIELD 

1.24 Definition A complex number is an ordered pair $(a, b)$ of real numbers.
“Ordered” means that $(a, b)$ and $(b, a)$ are regarded as distinct if $a \neq b$.

Let $x = (a, b)$, $y = (c, d)$ be two complex numbers. We write $x = y$ if and
only if $a = c$ and $b = d$. (Note that this definition is not entirely superfluous;
think of equality of rational numbers, represented as quotients of integers.) We
define

$x + y = (a + c, b + d),$
$xy = (ac - bd, ad + bc).$


1.25 Theorem These definitions of addition and multiplication turn the set of
all complex numbers into a field, with $(0, 0)$ and $(1, 0)$ in the role of 0 and 1.

Proof We simply verify the field axioms, as listed in Definition 1.12.
(Of course, we use the field structure of $R$.)

Let $x = (a, b)$, $y = (c, d)$, $z = (e, f)$.

(A1) is clear.

(A2) $x + y = (a + c, b + d) = (c + a, d + b) = y + x$.

(A3) $(x + y) + z = (a + c, b + d) + (e, f)$
$\quad = (a + c + e, b + d + f)$
$\quad = (a, b) + (c + e, d + f) = x + (y + z)$.

(A4) $x + 0 = (a, b) + (0, 0) = (a, b) = x$.

(A5) Put $-x = (-a, -b)$. Then $x + (-x) = (0, 0) = 0$.

(M1) is clear.

(M2) $xy = (ac - bd, ad + bc) = (ca - db, da + cb) = yx$.

(M3) $(xy)z = (ac - bd, ad + bc)(e, f)$
$\quad = (ace - bde - adf - bcf, acf - bdf + ade + bce)$
$\quad = (a, b)(ce - df, cf + de) = x(yz)$.

(M4) $1x = (1, 0)(a, b) = (a, b) = x$.

(M5) If $x \neq 0$ then $(a, b) \neq (0, 0)$, which means that at least one of the
real numbers $a$, $b$ is different from 0. Hence $a^2 + b^2 > 0$, by Proposition
1.18(d), and we can define

$\frac{1}{x} = \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right).$

Then

$x \cdot \frac{1}{x} = (a, b)\left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right) = (1, 0) = 1.$

(D) $x(y + z) = (a, b)(c + e, d + f)$
$\quad = (ac + ae - bd - bf, ad + af + bc + be)$
$\quad = (ac - bd, ad + bc) + (ae - bf, af + be)$
$\quad = xy + xz$.

1.26 Theorem For any real numbers $a$ and $b$ we have

$(a, 0) + (b, 0) = (a + b, 0)$,    $(a, 0)(b, 0) = (ab, 0)$.

The proof is trivial.

Theorem 1.26 shows that the complex numbers of the form $(a, 0)$ have the
same arithmetic properties as the corresponding real numbers $a$. We can
therefore identify $(a, 0)$ with $a$. This identification gives us the real field as a subfield
of the complex field.

The reader may have noticed that we have defined the complex numbers
without any reference to the mysterious square root of $-1$. We now show that
the notation $(a, b)$ is equivalent to the more customary $a + bi$.


1.27 Definition $i = (0, 1)$.

1.28 Theorem $i^2 = -1$.

Proof $i^2 = (0, 1)(0, 1) = (-1, 0) = -1$.


1.29 Theorem If $a$ and $b$ are real, then $(a, b) = a + bi$.

Proof

$a + bi = (a, 0) + (b, 0)(0, 1)$
$\quad = (a, 0) + (0, b) = (a, b)$.
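
The ordered-pair definitions of 1.24 and the inverse from the proof of (M5) are easy to program directly, which makes Theorems 1.28 and 1.29 concrete. The sketch below is an illustration; the names `add`, `mul`, `inv`, and `I` are hypothetical, and exact `Fraction` arithmetic is assumed for the inverse so that no rounding occurs.

```python
from fractions import Fraction


def add(x, y):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = x, y
    return (a + c, b + d)


def mul(x, y):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)


def inv(x):
    """1/x = (a/(a^2 + b^2), -b/(a^2 + b^2)) for x != (0, 0), as in (M5)."""
    a, b = x
    r = a * a + b * b
    if r == 0:
        raise ZeroDivisionError("(0, 0) has no inverse")
    return (Fraction(a) / r, Fraction(-b) / r)


I = (0, 1)  # Definition 1.27

assert mul(I, I) == (-1, 0)                       # Theorem 1.28
assert add((3, 0), mul((4, 0), I)) == (3, 4)      # Theorem 1.29: 3 + 4i = (3, 4)
assert mul((3, 4), inv((3, 4))) == (1, 0)         # (M5)
```

As a cross-check, `complex(*mul((1, 2), (3, 4)))` agrees with Python's built-in product `(1+2j) * (3+4j)`.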


1.30 Definition If $a$, $b$ are real and $z = a + bi$, then the complex number
$\bar{z} = a - bi$ is called the conjugate of $z$. The numbers $a$ and $b$ are the real part
and the imaginary part of $z$, respectively.

We shall occasionally write 

a = Re(z), b = Im(z). 


1.31 Theorem If $z$ and $w$ are complex, then

(a) $\overline{z + w} = \bar{z} + \bar{w}$,

(b) $\overline{zw} = \bar{z} \cdot \bar{w}$,

(c) $z + \bar{z} = 2\,\mathrm{Re}(z)$, $z - \bar{z} = 2i\,\mathrm{Im}(z)$,

(d) $z\bar{z}$ is real and positive (except when $z = 0$).

Proof (a), (b), and (c) are quite trivial. To prove (d), write $z = a + bi$,
and note that $z\bar{z} = a^2 + b^2$.


1.32 Definition If $z$ is a complex number, its absolute value $|z|$ is the
nonnegative square root of $z\bar{z}$; that is, $|z| = (z\bar{z})^{1/2}$.

The existence (and uniqueness) of $|z|$ follows from Theorem 1.21 and
part (d) of Theorem 1.31.

Note that when $x$ is real, then $\bar{x} = x$, hence $|x| = \sqrt{x^2}$. Thus $|x| = x$
if $x \ge 0$, $|x| = -x$ if $x < 0$.

1.33 Theorem Let $z$ and $w$ be complex numbers. Then

(a) $|z| > 0$ unless $z = 0$, $|0| = 0$,

(b) $|\bar{z}| = |z|$,

(c) $|zw| = |z|\,|w|$,

(d) $|\mathrm{Re}\, z| \le |z|$,

(e) $|z + w| \le |z| + |w|$.

Proof (a) and (b) are trivial. Put $z = a + bi$, $w = c + di$, with $a$, $b$, $c$, $d$
real. Then

$|zw|^2 = (ac - bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2) = |z|^2|w|^2$

or $|zw|^2 = (|z|\,|w|)^2$. Now (c) follows from the uniqueness assertion of
Theorem 1.21.

To prove (d), note that $a^2 \le a^2 + b^2$, hence

$|a| = \sqrt{a^2} \le \sqrt{a^2 + b^2}.$

To prove (e), note that $\bar{z}w$ is the conjugate of $z\bar{w}$, so that $z\bar{w} + \bar{z}w =
2\,\mathrm{Re}(z\bar{w})$. Hence

$|z + w|^2 = (z + w)(\bar{z} + \bar{w}) = z\bar{z} + z\bar{w} + \bar{z}w + w\bar{w}$
$\quad = |z|^2 + 2\,\mathrm{Re}(z\bar{w}) + |w|^2$
$\quad \le |z|^2 + 2|z\bar{w}| + |w|^2$
$\quad = |z|^2 + 2|z|\,|w| + |w|^2 = (|z| + |w|)^2.$

Now (e) follows by taking square roots.

1.34 Notation If $x_1, \ldots, x_n$ are complex numbers, we write

$x_1 + x_2 + \cdots + x_n = \sum_{j=1}^{n} x_j.$

We conclude this section with an important inequality, usually known as 
the Schwarz inequality. 

1.35 Theorem If $a_1, \dots, a_n$ and $b_1, \dots, b_n$ are complex numbers, then

$$\left| \sum_{j=1}^{n} a_j \bar{b}_j \right|^2 \le \sum_{j=1}^{n} |a_j|^2 \sum_{j=1}^{n} |b_j|^2.$$

Proof Put $A = \sum |a_j|^2$, $B = \sum |b_j|^2$, $C = \sum a_j \bar{b}_j$ (in all sums in this proof,
j runs over the values $1, \dots, n$). If $B = 0$, then $b_1 = \cdots = b_n = 0$, and the
conclusion is trivial. Assume therefore that $B > 0$. By Theorem 1.31 we
have

$$\begin{aligned}
\sum |Ba_j - Cb_j|^2 &= \sum (Ba_j - Cb_j)(B\bar{a}_j - \bar{C}\bar{b}_j) \\
&= B^2 \sum |a_j|^2 - B\bar{C} \sum a_j \bar{b}_j - BC \sum \bar{a}_j b_j + |C|^2 \sum |b_j|^2 \\
&= B^2 A - B|C|^2 \\
&= B(AB - |C|^2).
\end{aligned}$$




Since each term in the first sum is nonnegative, we see that

$$B(AB - |C|^2) \ge 0.$$

Since $B > 0$, it follows that $AB - |C|^2 \ge 0$. This is the desired inequality.
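
The quantities A, B, C of the proof, and the identity that the sum of the $|Ba_j - Cb_j|^2$ equals $B(AB - |C|^2)$, can be computed for a concrete pair of lists. The sketch below is illustrative only: the function name `schwarz_quantities` and the sample data are assumptions, and equality is tested up to floating-point tolerance.

```python
def schwarz_quantities(a, b):
    """Return (A, B, C, S) as in the proof of Theorem 1.35 for complex lists a, b.

    A = sum |a_j|^2, B = sum |b_j|^2, C = sum a_j * conj(b_j),
    S = sum |B a_j - C b_j|^2.
    """
    A = sum(abs(x) ** 2 for x in a)
    B = sum(abs(y) ** 2 for y in b)
    C = sum(x * y.conjugate() for x, y in zip(a, b))
    S = sum(abs(B * x - C * y) ** 2 for x, y in zip(a, b))
    return A, B, C, S

# Hypothetical sample data, chosen only to exercise the formulas.
a = [1 + 2j, 3 - 1j, 0.5j]
b = [2 - 1j, 1j, 1 + 1j]
A, B, C, S = schwarz_quantities(a, b)

identity_gap = abs(S - B * (A * B - abs(C) ** 2))  # should be ~0 up to rounding
schwarz_holds = abs(C) ** 2 <= A * B
```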


EUCLIDEAN SPACES 

1.36 Definitions For each positive integer k, let $R^k$ be the set of all ordered
k-tuples

$$\mathbf{x} = (x_1, x_2, \dots, x_k),$$

where $x_1, \dots, x_k$ are real numbers, called the coordinates of $\mathbf{x}$. The elements of
$R^k$ are called points, or vectors, especially when $k > 1$. We shall denote vectors
by boldfaced letters. If $\mathbf{y} = (y_1, \dots, y_k)$ and if $\alpha$ is a real number, put

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, \dots, x_k + y_k),$$

$$\alpha\mathbf{x} = (\alpha x_1, \dots, \alpha x_k)$$

so that $\mathbf{x} + \mathbf{y} \in R^k$ and $\alpha\mathbf{x} \in R^k$. This defines addition of vectors, as well as
multiplication of a vector by a real number (a scalar). These two operations
satisfy the commutative, associative, and distributive laws (the proof is trivial,
in view of the analogous laws for the real numbers) and make $R^k$ into a vector
space over the real field. The zero element of $R^k$ (sometimes called the origin or
the null vector) is the point $\mathbf{0}$, all of whose coordinates are 0.

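We also define the so-called "inner product" (or scalar product) of $\mathbf{x}$ and
$\mathbf{y}$ by

$$\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{k} x_i y_i$$

and the norm of $\mathbf{x}$ by

$$|\mathbf{x}| = (\mathbf{x} \cdot \mathbf{x})^{1/2} = \left( \sum_{i=1}^{k} x_i^2 \right)^{1/2}.$$

The structure now defined (the vector space $R^k$ with the above inner
product and norm) is called euclidean k-space.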

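1.37 Theorem Suppose $\mathbf{x}, \mathbf{y}, \mathbf{z} \in R^k$, and $\alpha$ is real. Then

(a) $|\mathbf{x}| \ge 0$;

(b) $|\mathbf{x}| = 0$ if and only if $\mathbf{x} = \mathbf{0}$;

(c) $|\alpha\mathbf{x}| = |\alpha|\,|\mathbf{x}|$;

(d) $|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}|\,|\mathbf{y}|$;

(e) $|\mathbf{x} + \mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|$;

(f) $|\mathbf{x} - \mathbf{z}| \le |\mathbf{x} - \mathbf{y}| + |\mathbf{y} - \mathbf{z}|$.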
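
Proof (a), (b), and (c) are obvious, and (d) is an immediate consequence
of the Schwarz inequality. By (d) we have

$$\begin{aligned}
|\mathbf{x} + \mathbf{y}|^2 &= (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) \\
&= \mathbf{x} \cdot \mathbf{x} + 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y} \\
&\le |\mathbf{x}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| + |\mathbf{y}|^2 \\
&= (|\mathbf{x}| + |\mathbf{y}|)^2,
\end{aligned}$$

so that (e) is proved. Finally, (f) follows from (e) if we replace $\mathbf{x}$ by
$\mathbf{x} - \mathbf{y}$ and $\mathbf{y}$ by $\mathbf{y} - \mathbf{z}$.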
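
The inner product and norm of Definition 1.36 translate directly into code. The following Python sketch represents vectors of $R^k$ as tuples of floats; the function names and the three sample vectors are assumptions made for illustration, and the checks verify parts (d), (e), and (f) only for that sample.

```python
import math

def dot(x, y):
    """Inner product x . y = sum of x_i * y_i over the k coordinates."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Norm |x| = (x . x)**(1/2)."""
    return math.sqrt(dot(x, x))

def add(x, y):
    """Coordinatewise sum x + y."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def sub(x, y):
    """Coordinatewise difference x - y."""
    return tuple(xi - yi for xi, yi in zip(x, y))

# Hypothetical sample vectors in R^3.
x, y, z = (1.0, 2.0, 2.0), (3.0, 0.0, 4.0), (-1.0, 5.0, 0.5)

schwarz = abs(dot(x, y)) <= norm(x) * norm(y)                            # 1.37(d)
triangle = norm(add(x, y)) <= norm(x) + norm(y)                          # 1.37(e)
metric_triangle = norm(sub(x, z)) <= norm(sub(x, y)) + norm(sub(y, z))   # 1.37(f)
```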

1.38 Remarks Theorem 1.37 (a), (b), and (f) will allow us (see Chap. 2) to
regard $R^k$ as a metric space.

$R^1$ (the set of all real numbers) is usually called the line, or the real line.
Likewise, $R^2$ is called the plane, or the complex plane (compare Definitions 1.24
and 1.36). In these two cases the norm is just the absolute value of the
corresponding real or complex number.


APPENDIX 

Theorem 1.19 will be proved in this appendix by constructing R from Q. We 
shall divide the construction into several steps. 

Step 1 The members of R will be certain subsets of Q, called cuts. A cut is,
by definition, any set $\alpha \subset Q$ with the following three properties.

(I) $\alpha$ is not empty, $\alpha \ne Q$.

(II) If $p \in \alpha$, $q \in Q$, and $q < p$, then $q \in \alpha$.

(III) If $p \in \alpha$, then $p < r$ for some $r \in \alpha$.

The letters p, q, r, ... will always denote rational numbers, and $\alpha, \beta, \gamma, \dots$
will denote cuts.

Note that (III) simply says that $\alpha$ has no largest member; (II) implies two
facts which will be used freely:

If $p \in \alpha$ and $q \notin \alpha$ then $p < q$.

If $r \notin \alpha$ and $r < s$ then $s \notin \alpha$.
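
To make the definition concrete, a cut can be modeled by its membership test on rational numbers. The sketch below is an illustration under stated assumptions, not a construction from the text: the names `rational_cut`, `sqrt2_cut`, `satisfies_II_on`, and `larger_member` are hypothetical, property (II) is only checked on a finite sample, and property (III) is only witnessed by a bounded search.

```python
from fractions import Fraction

def rational_cut(r):
    """Membership test for the set of all rationals p with p < r."""
    r = Fraction(r)
    return lambda p: Fraction(p) < r

def sqrt2_cut(p):
    """Membership test for the set of all rationals p with p < 0 or p*p < 2."""
    p = Fraction(p)
    return p < 0 or p * p < 2

def satisfies_II_on(cut, samples):
    """Check property (II) on a finite sample: p in cut and q < p imply q in cut."""
    return all(cut(q) for p in samples if cut(p) for q in samples if q < p)

def larger_member(cut, p, max_steps=60):
    """Look for some r > p that still lies in the cut, as property (III) requires.

    Tries p + 1, p + 1/2, p + 1/4, ... and returns the first hit, or None.
    """
    step = Fraction(1)
    for _ in range(max_steps):
        if cut(p + step):
            return p + step
        step /= 2
    return None

samples = [Fraction(n, 10) for n in range(-30, 31)]
witness = larger_member(sqrt2_cut, Fraction(7, 5))
```

Here `sqrt2_cut` contains 7/5 (since 49/25 < 2) but not 3/2, and `witness` is a rational larger than 7/5 that still belongs to the set.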

Step 2 Define "$\alpha < \beta$" to mean: $\alpha$ is a proper subset of $\beta$.

Let us check that this meets the requirements of Definition 1.5.

If $\alpha < \beta$ and $\beta < \gamma$ it is clear that $\alpha < \gamma$. (A proper subset of a proper
subset is a proper subset.) It is also clear that at most one of the three relations

$$\alpha < \beta, \qquad \alpha = \beta, \qquad \beta < \alpha$$

can hold for any pair $\alpha, \beta$. To show that at least one holds, assume that the
first two fail. Then $\alpha$ is not a subset of $\beta$. Hence there is a $p \in \alpha$ with $p \notin \beta$.
If $q \in \beta$, it follows that $q < p$ (since $p \notin \beta$), hence $q \in \alpha$, by (II). Thus $\beta \subset \alpha$.
Since $\beta \ne \alpha$, we conclude: $\beta < \alpha$.

Thus R is now an ordered set.

Step 3 The ordered set R has the least-upper-bound property.

To prove this, let A be a nonempty subset of R, and assume that $\beta \in R$
is an upper bound of A. Define $\gamma$ to be the union of all $\alpha \in A$. In other words,
$p \in \gamma$ if and only if $p \in \alpha$ for some $\alpha \in A$. We shall prove that $\gamma \in R$ and that
$\gamma = \sup A$.

Since A is not empty, there exists an $\alpha_0 \in A$. This $\alpha_0$ is not empty. Since
$\alpha_0 \subset \gamma$, $\gamma$ is not empty. Next, $\gamma \subset \beta$ (since $\alpha \subset \beta$ for every $\alpha \in A$), and therefore
$\gamma \ne Q$. Thus $\gamma$ satisfies property (I). To prove (II) and (III), pick $p \in \gamma$. Then
$p \in \alpha_1$ for some $\alpha_1 \in A$. If $q < p$, then $q \in \alpha_1$, hence $q \in \gamma$; this proves (II). If
$r \in \alpha_1$ is so chosen that $r > p$, we see that $r \in \gamma$ (since $\alpha_1 \subset \gamma$), and therefore $\gamma$
satisfies (III).

Thus $\gamma \in R$.

It is clear that $\alpha \le \gamma$ for every $\alpha \in A$.

Suppose $\delta < \gamma$. Then there is an $s \in \gamma$ such that $s \notin \delta$. Since $s \in \gamma$, $s \in \alpha$
for some $\alpha \in A$. Hence $\delta < \alpha$, and $\delta$ is not an upper bound of A.

This gives the desired result: $\gamma = \sup A$.

Step 4 If $\alpha \in R$ and $\beta \in R$ we define $\alpha + \beta$ to be the set of all sums $r + s$, where
$r \in \alpha$ and $s \in \beta$.

We define $0^*$ to be the set of all negative rational numbers. It is clear that
$0^*$ is a cut. We verify that the axioms for addition (see Definition 1.12) hold in
R, with $0^*$ playing the role of 0.

(A1) We have to show that $\alpha + \beta$ is a cut. It is clear that $\alpha + \beta$ is a
nonempty subset of Q. Take $r' \notin \alpha$, $s' \notin \beta$. Then $r' + s' > r + s$ for all
choices of $r \in \alpha$, $s \in \beta$. Thus $r' + s' \notin \alpha + \beta$. It follows that $\alpha + \beta$ has
property (I).

Pick $p \in \alpha + \beta$. Then $p = r + s$, with $r \in \alpha$, $s \in \beta$. If $q < p$, then
$q - s < r$, so $q - s \in \alpha$, and $q = (q - s) + s \in \alpha + \beta$. Thus (II) holds.
Choose $t \in \alpha$ so that $t > r$. Then $p < t + s$ and $t + s \in \alpha + \beta$. Thus (III)
holds.

(A2) $\alpha + \beta$ is the set of all $r + s$, with $r \in \alpha$, $s \in \beta$. By the same definition,
$\beta + \alpha$ is the set of all $s + r$. Since $r + s = s + r$ for all $r \in Q$, $s \in Q$, we
have $\alpha + \beta = \beta + \alpha$.

(A3) As above, this follows from the associative law in Q.

(A4) If $r \in \alpha$ and $s \in 0^*$, then $r + s < r$, hence $r + s \in \alpha$. Thus $\alpha + 0^* \subset \alpha$.
To obtain the opposite inclusion, pick $p \in \alpha$, and pick $r \in \alpha$, $r > p$. Then
$p - r \in 0^*$, and $p = r + (p - r) \in \alpha + 0^*$. Thus $\alpha \subset \alpha + 0^*$. We conclude
that $\alpha + 0^* = \alpha$.

(A5) Fix $\alpha \in R$. Let $\beta$ be the set of all p with the following property:
There exists $r > 0$ such that $-p - r \notin \alpha$.

In other words, some rational number smaller than $-p$ fails to
be in $\alpha$.

We show that $\beta \in R$ and that $\alpha + \beta = 0^*$.

If $s \notin \alpha$ and $p = -s - 1$, then $-p - 1 \notin \alpha$, hence $p \in \beta$. So $\beta$ is not
empty. If $q \in \alpha$, then $-q \notin \beta$. So $\beta \ne Q$. Hence $\beta$ satisfies (I).

Pick $p \in \beta$, and pick $r > 0$, so that $-p - r \notin \alpha$. If $q < p$, then
$-q - r > -p - r$, hence $-q - r \notin \alpha$. Thus $q \in \beta$, and (II) holds. Put
$t = p + (r/2)$. Then $t > p$, and $-t - (r/2) = -p - r \notin \alpha$, so that $t \in \beta$.
Hence $\beta$ satisfies (III).

We have proved that $\beta \in R$.

If $r \in \alpha$ and $s \in \beta$, then $-s \notin \alpha$, hence $r < -s$, $r + s < 0$. Thus
$\alpha + \beta \subset 0^*$.

To prove the opposite inclusion, pick $v \in 0^*$, put $w = -v/2$. Then
$w > 0$, and there is an integer n such that $nw \in \alpha$ but $(n + 1)w \notin \alpha$. (Note
that this depends on the fact that Q has the archimedean property!) Put
$p = -(n + 2)w$. Then $p \in \beta$, since $-p - w \notin \alpha$, and

$$v = nw + p \in \alpha + \beta.$$

Thus $0^* \subset \alpha + \beta$.

We conclude that $\alpha + \beta = 0^*$.

This $\beta$ will of course be denoted by $-\alpha$.

Step 5 Having proved that the addition defined in Step 4 satisfies Axioms (A)
of Definition 1.12, it follows that Proposition 1.14 is valid in R, and we can
prove one of the requirements of Definition 1.17:

If $\alpha, \beta, \gamma \in R$ and $\beta < \gamma$, then $\alpha + \beta < \alpha + \gamma$.

Indeed, it is obvious from the definition of + in R that $\alpha + \beta \subset \alpha + \gamma$; if
we had $\alpha + \beta = \alpha + \gamma$, the cancellation law (Proposition 1.14) would imply
$\beta = \gamma$.

It also follows that $\alpha > 0^*$ if and only if $-\alpha < 0^*$.

Step 6 Multiplication is a little more bothersome than addition in the present
context, since products of negative rationals are positive. For this reason we
confine ourselves first to $R^+$, the set of all $\alpha \in R$ with $\alpha > 0^*$.

If $\alpha \in R^+$ and $\beta \in R^+$, we define $\alpha\beta$ to be the set of all p such that $p \le rs$
for some choice of $r \in \alpha$, $s \in \beta$, $r > 0$, $s > 0$.

We define $1^*$ to be the set of all $q < 1$.

Then the axioms (M) and (D) of Definition 1.12 hold, with $R^+$ in place of F,
and with $1^*$ in the role of 1.

The proofs are so similar to the ones given in detail in Step 4 that we omit
them.

Note, in particular, that the second requirement of Definition 1.17 holds:
If $\alpha > 0^*$ and $\beta > 0^*$ then $\alpha\beta > 0^*$.

Step 7 We complete the definition of multiplication by setting $\alpha 0^* = 0^* \alpha = 0^*$,
and by setting

$$\alpha\beta = \begin{cases}
(-\alpha)(-\beta) & \text{if } \alpha < 0^*,\ \beta < 0^*, \\
-[(-\alpha)\beta] & \text{if } \alpha < 0^*,\ \beta > 0^*, \\
-[\alpha \cdot (-\beta)] & \text{if } \alpha > 0^*,\ \beta < 0^*.
\end{cases}$$

The products on the right were defined in Step 6.

Having proved (in Step 6) that the axioms (M) hold in $R^+$, it is now
perfectly simple to prove them in R, by repeated application of the identity
$\gamma = -(-\gamma)$ which is part of Proposition 1.14. (See Step 5.)

The proof of the distributive law

$$\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$$

breaks into cases. For instance, suppose $\alpha > 0^*$, $\beta < 0^*$, $\beta + \gamma > 0^*$. Then
$\gamma = (\beta + \gamma) + (-\beta)$, and (since we already know that the distributive law holds
in $R^+$)

$$\alpha\gamma = \alpha(\beta + \gamma) + \alpha \cdot (-\beta).$$

But $\alpha \cdot (-\beta) = -(\alpha\beta)$. Thus

$$\alpha\beta + \alpha\gamma = \alpha(\beta + \gamma).$$

The other cases are handled in the same way.

We have now completed the proof that R is an ordered field with the
least-upper-bound property.

Step 8 We associate with each $r \in Q$ the set $r^*$ which consists of all $p \in Q$
such that $p < r$. It is clear that each $r^*$ is a cut; that is, $r^* \in R$. These cuts satisfy
the following relations:

(a) $r^* + s^* = (r + s)^*$,

(b) $r^* s^* = (rs)^*$,

(c) $r^* < s^*$ if and only if $r < s$.

To prove (a), choose $p \in r^* + s^*$. Then $p = u + v$, where $u < r$, $v < s$.
Hence $p < r + s$, which says that $p \in (r + s)^*$.

Conversely, suppose $p \in (r + s)^*$. Then $p < r + s$. Choose t so that
$2t = r + s - p$, put

$$r' = r - t, \qquad s' = s - t.$$

Then $r' \in r^*$, $s' \in s^*$, and $p = r' + s'$, so that $p \in r^* + s^*$.

This proves (a). The proof of (b) is similar.

If $r < s$ then $r \in s^*$, but $r \notin r^*$; hence $r^* < s^*$.

If $r^* < s^*$, then there is a $p \in s^*$ such that $p \notin r^*$. Hence $r \le p < s$, so
that $r < s$.

This proves (c).

Step 9 We saw in Step 8 that the replacement of the rational numbers r by the 
corresponding "rational cuts" $r^* \in R$ preserves sums, products, and order. This
fact may be expressed by saying that the ordered field Q is isomorphic to the 
ordered field Q* whose elements are the rational cuts. Of course, r* is by no 
means the same as r, but the properties we are concerned with (arithmetic and 
order) are the same in the two fields. 

It is this identification of Q with Q* which allows us to regard Q as a 
subfield of R. 

The second part of Theorem 1.19 is to be understood in terms of this 
identification. Note that the same phenomenon occurs when the real numbers 
are regarded as a subfield of the complex field, and it also occurs at a much 
more elementary level, when the integers are identified with a certain subset of Q. 

It is a fact, which we will not prove here, that any two ordered fields with 
the least-upper-bound property are isomorphic. The first part of Theorem 1.19 
therefore characterizes the real field R completely. 

The books by Landau and Thurston cited in the Bibliography are entirely 
devoted to number systems. Chapter 1 of Knopp’s book contains a more 
leisurely description of how R can be obtained from Q. Another construction, 
in which each real number is defined to be an equivalence class of Cauchy 
sequences of rational numbers (see Chap. 3), is carried out in Sec. 5 of the book 
by Hewitt and Stromberg. 

The cuts in Q which we used here were invented by Dedekind. The 
construction of R from Q by means of Cauchy sequences is due to Cantor. 
Both Cantor and Dedekind published their constructions in 1872. 


EXERCISES 

Unless the contrary is explicitly stated, all numbers that are mentioned in these exer- 
cises are understood to be real. 

1. If r is rational ($r \ne 0$) and x is irrational, prove that $r + x$ and $rx$ are irrational.

2. Prove that there is no rational number whose square is 12.

3. Prove Proposition 1.15. 

4. Let E be a nonempty subset of an ordered set; suppose $\alpha$ is a lower bound of E
and $\beta$ is an upper bound of E. Prove that $\alpha \le \beta$.

5. Let A be a nonempty set of real numbers which is bounded below. Let $-A$ be
the set of all numbers $-x$, where $x \in A$. Prove that

$$\inf A = -\sup(-A).$$

6. Fix $b > 1$.

(a) If m, n, p, q are integers, $n > 0$, $q > 0$, and $r = m/n = p/q$, prove that

$$(b^m)^{1/n} = (b^p)^{1/q}.$$

Hence it makes sense to define $b^r = (b^m)^{1/n}$.

(b) Prove that $b^{r+s} = b^r b^s$ if r and s are rational.

(c) If x is real, define B(x) to be the set of all numbers $b^t$, where t is rational and
$t \le x$. Prove that

$$b^r = \sup B(r)$$

when r is rational. Hence it makes sense to define

$$b^x = \sup B(x)$$

for every real x.

(d) Prove that $b^{x+y} = b^x b^y$ for all real x and y.

7. Fix $b > 1$, $y > 0$, and prove that there is a unique real x such that $b^x = y$, by
completing the following outline. (This x is called the logarithm of y to the base b.)

(a) For any positive integer n, $b^n - 1 \ge n(b - 1)$.

(b) Hence $b - 1 \ge n(b^{1/n} - 1)$.

(c) If $t > 1$ and $n > (b - 1)/(t - 1)$, then $b^{1/n} < t$.

(d) If w is such that $b^w < y$, then $b^{w + (1/n)} < y$ for sufficiently large n; to see this,
apply part (c) with $t = y \cdot b^{-w}$.

(e) If $b^w > y$, then $b^{w - (1/n)} > y$ for sufficiently large n.

(f) Let A be the set of all w such that $b^w < y$, and show that $x = \sup A$ satisfies
$b^x = y$.

(g) Prove that this x is unique.

8. Prove that no order can be defined in the complex field that turns it into an ordered 
field. Hint: — 1 is a square. 

9. Suppose z = a + bi, w = c + di. Define z < w if a < c, and also if a = c but 
b < d. Prove that this turns the set of all complex numbers into an ordered set. 
(This type of order relation is called a dictionary order , or lexicographic order , for 
obvious reasons.) Does this ordered set have the least-upper-bound property? 

10. Suppose $z = a + bi$, $w = u + iv$, and

$$a = \left( \frac{|w| + u}{2} \right)^{1/2}, \qquad b = \left( \frac{|w| - u}{2} \right)^{1/2}.$$

Prove that $z^2 = w$ if $v \ge 0$ and that $(\bar{z})^2 = w$ if $v \le 0$. Conclude that every complex
number (with one exception!) has two complex square roots.

11. If z is a complex number, prove that there exists an $r \ge 0$ and a complex number
w with $|w| = 1$ such that $z = rw$. Are w and r always uniquely determined by z?

12. If $z_1, \dots, z_n$ are complex, prove that

$$|z_1 + z_2 + \cdots + z_n| \le |z_1| + |z_2| + \cdots + |z_n|.$$

13. If x, y are complex, prove that

$$\bigl| |x| - |y| \bigr| \le |x - y|.$$

14. If z is a complex number such that $|z| = 1$, that is, such that $z\bar{z} = 1$, compute

$$|1 + z|^2 + |1 - z|^2.$$

15. Under what conditions does equality hold in the Schwarz inequality? 

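16. Suppose $k \ge 3$, $\mathbf{x}, \mathbf{y} \in R^k$, $|\mathbf{x} - \mathbf{y}| = d > 0$, and $r > 0$. Prove:

(a) If $2r > d$, there are infinitely many $\mathbf{z} \in R^k$ such that

$$|\mathbf{z} - \mathbf{x}| = |\mathbf{z} - \mathbf{y}| = r.$$

(b) If $2r = d$, there is exactly one such $\mathbf{z}$.

(c) If $2r < d$, there is no such $\mathbf{z}$.

How must these statements be modified if k is 2 or 1?

17. Prove that

$$|\mathbf{x} + \mathbf{y}|^2 + |\mathbf{x} - \mathbf{y}|^2 = 2|\mathbf{x}|^2 + 2|\mathbf{y}|^2$$

if $\mathbf{x} \in R^k$ and $\mathbf{y} \in R^k$. Interpret this geometrically, as a statement about
parallelograms.

18. If $k \ge 2$ and $\mathbf{x} \in R^k$, prove that there exists $\mathbf{y} \in R^k$ such that $\mathbf{y} \ne \mathbf{0}$ but $\mathbf{x} \cdot \mathbf{y} = 0$.
Is this also true if $k = 1$?

19. Suppose $\mathbf{a} \in R^k$, $\mathbf{b} \in R^k$. Find $\mathbf{c} \in R^k$ and $r > 0$ such that

$$|\mathbf{x} - \mathbf{a}| = 2|\mathbf{x} - \mathbf{b}|$$

if and only if $|\mathbf{x} - \mathbf{c}| = r$.

(Solution: $3\mathbf{c} = 4\mathbf{b} - \mathbf{a}$, $3r = 2|\mathbf{b} - \mathbf{a}|$.)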

20. With reference to the Appendix, suppose that property (III) were omitted from the 
definition of a cut. Keep the same definitions of order and addition. Show that 
the resulting ordered set has the least-upper-bound property, that addition satisfies 
axioms (A1) to (A4) (with a slightly different zero-element!) but that (A5) fails.



2 

BASIC TOPOLOGY 


FINITE, COUNTABLE, AND UNCOUNTABLE SETS 
We begin this section with a definition of the function concept. 

2.1 Definition Consider two sets A and B, whose elements may be any objects
whatsoever, and suppose that with each element x of A there is associated, in
some manner, an element of B, which we denote by $f(x)$. Then f is said to be a
function from A to B (or a mapping of A into B). The set A is called the domain
of f (we also say f is defined on A), and the elements $f(x)$ are called the values
of f. The set of all values of f is called the range of f.

2.2 Definition Let A and B be two sets and let f be a mapping of A into B.
If $E \subset A$, $f(E)$ is defined to be the set of all elements $f(x)$, for $x \in E$. We call
$f(E)$ the image of E under f. In this notation, $f(A)$ is the range of f. It is clear
that $f(A) \subset B$. If $f(A) = B$, we say that f maps A onto B. (Note that, according
to this usage, onto is more specific than into.)

If $E \subset B$, $f^{-1}(E)$ denotes the set of all $x \in A$ such that $f(x) \in E$. We call
$f^{-1}(E)$ the inverse image of E under f. If $y \in B$, $f^{-1}(y)$ is the set of all $x \in A$
such that $f(x) = y$. If, for each $y \in B$, $f^{-1}(y)$ consists of at most one element
of A, then f is said to be a 1-1 (one-to-one) mapping of A into B. This may
also be expressed as follows: f is a 1-1 mapping of A into B provided that
$f(x_1) \ne f(x_2)$ whenever $x_1 \ne x_2$, $x_1 \in A$, $x_2 \in A$.

(The notation $x_1 \ne x_2$ means that $x_1$ and $x_2$ are distinct elements; otherwise
we write $x_1 = x_2$.)
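
For finite sets, the notions of image, inverse image, onto, and 1-1 can be computed directly. The following Python sketch is only an illustration: the helper names `image`, `inverse_image`, and `is_one_to_one`, and the example sets and squaring map, are assumptions and do not come from the text.

```python
def image(f, E):
    """f(E): the set of all values f(x) for x in E."""
    return {f(x) for x in E}

def inverse_image(f, A, E):
    """The set of all x in the domain A such that f(x) lies in E."""
    return {x for x in A if f(x) in E}

def is_one_to_one(f, A):
    """True if distinct elements of the finite set A have distinct values."""
    values = [f(x) for x in A]
    return len(set(values)) == len(values)

# Hypothetical example: the squaring map from A into B.
A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}
square = lambda x: x * x

range_of_square = image(square, A)          # a subset of B, but not all of B
maps_onto_B = range_of_square == B          # False: 2 and 3 are never attained
preimage_of_1 = inverse_image(square, A, {1})
```

In this example the map is into B but not onto B, and it is not 1-1, since the inverse image of 1 contains two elements.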


2.3 Definition If there exists a 1-1 mapping of A onto B, we say that A and B
can be put in 1-1 correspondence, or that A and B have the same cardinal number,
or briefly, that A and B are equivalent, and we write $A \sim B$. This relation clearly
has the following properties:

It is reflexive: $A \sim A$.

It is symmetric: If $A \sim B$, then $B \sim A$.

It is transitive: If $A \sim B$ and $B \sim C$, then $A \sim C$.

Any relation with these three properties is called an equivalence relation.


2.4 Definition For any positive integer n, let $J_n$ be the set whose elements are
the integers $1, 2, \dots, n$; let J be the set consisting of all positive integers. For any set
A, we say:

(a) A is finite if $A \sim J_n$ for some n (the empty set is also considered to be
finite).

(b) A is infinite if A is not finite.

(c) A is countable if $A \sim J$.

(d) A is uncountable if A is neither finite nor countable.

(e) A is at most countable if A is finite or countable.

Countable sets are sometimes called enumerable, or denumerable.

For two finite sets A and B, we evidently have $A \sim B$ if and only if A and
B contain the same number of elements. For infinite sets, however, the idea of
"having the same number of elements" becomes quite vague, whereas the notion
of 1-1 correspondence retains its clarity.


2.5 Example Let A be the set of all integers. Then A is countable. For,
consider the following arrangement of the sets A and J:

A:   0,  1, -1,  2, -2,  3, -3, ...
J:   1,  2,  3,  4,  5,  6,  7, ...

We can, in this example, even give an explicit formula for a function f
from J to A which sets up a 1-1 correspondence:

$$f(n) = \begin{cases}
\dfrac{n}{2} & (n \text{ even}), \\[2ex]
-\dfrac{n - 1}{2} & (n \text{ odd}).
\end{cases}$$
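
The explicit formula can be transcribed into a short function and compared with the arrangement displayed above. This is a minimal Python sketch; the inverse map `f_inverse` is not stated in the text and is included here as an assumption that is easy to verify by hand.

```python
def f(n: int) -> int:
    """Map the positive integer n to n/2 if n is even, and to -(n - 1)/2 if n is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def f_inverse(m: int) -> int:
    """Assumed inverse: positive m comes from 2m, nonpositive m from 1 - 2m."""
    return 2 * m if m > 0 else 1 - 2 * m

first_values = [f(n) for n in range(1, 8)]  # matches the row labeled A above
```

Checking that `f_inverse(f(n)) == n` and `f(f_inverse(m)) == m` on a finite range is consistent with f being a 1-1 mapping of J onto A, although only the argument in the text covers all integers.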


2.6 Remark A finite set cannot be equivalent to one of its proper subsets. 
That this is, however, possible for infinite sets, is shown by Example 2.5, in 
which J is a proper subset of A. 

In fact, we could replace Definition 2.4(b) by the statement: A is infinite if
A is equivalent to one of its proper subsets.

2.7 Definition By a sequence, we mean a function f defined on the set J of all
positive integers. If $f(n) = x_n$, for $n \in J$, it is customary to denote the sequence
f by the symbol $\{x_n\}$, or sometimes by $x_1, x_2, x_3, \dots$. The values of f, that is,
the elements $x_n$, are called the terms of the sequence. If A is a set and if $x_n \in A$
for all $n \in J$, then $\{x_n\}$ is said to be a sequence in A, or a sequence of elements of A.

Note that the terms $x_1, x_2, x_3, \dots$ of a sequence need not be distinct.
Since every countable set is the range of a 1-1 function defined on J, we
may regard every countable set as the range of a sequence of distinct terms.
Speaking more loosely, we may say that the elements of any countable set can
be "arranged in a sequence."

Sometimes it is convenient to replace J in this definition by the set of all 
nonnegative integers, i.e., to start with 0 rather than with 1. 

2.8 Theorem Every infinite subset of a countable set A is countable. 

Proof Suppose $E \subset A$, and E is infinite. Arrange the elements x of A in
a sequence $\{x_n\}$ of distinct elements. Construct a sequence $\{n_k\}$ as follows:

Let $n_1$ be the smallest positive integer such that $x_{n_1} \in E$. Having
chosen $n_1, \dots, n_{k-1}$ ($k = 2, 3, 4, \dots$), let $n_k$ be the smallest integer greater
than $n_{k-1}$ such that $x_{n_k} \in E$.

Putting $f(k) = x_{n_k}$ ($k = 1, 2, 3, \dots$), we obtain a 1-1 correspondence
between E and J.
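
The construction of the indices $n_1 < n_2 < \cdots$ is effectively an algorithm: scan the sequence and keep the positions whose terms lie in E. The Python sketch below illustrates this under assumptions that are not in the text: the enumeration `x` of the integers is borrowed from Example 2.5, E is taken to be the multiples of 3, and all function names are hypothetical. If E were finite, the generator would eventually search forever, which mirrors the hypothesis that E is infinite.

```python
from itertools import count, islice

def x(n: int) -> int:
    """A sample enumeration of the integers without repetition: 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def chosen_indices(x, in_E):
    """Yield n_1 < n_2 < ...: n_k is the smallest index beyond n_{k-1} with x(n_k) in E."""
    for n in count(1):
        if in_E(x(n)):
            yield n

def f(k: int, x, in_E):
    """f(k) = x(n_k), the kth term of the sequence that lies in E."""
    n_k = next(islice(chosen_indices(x, in_E), k - 1, None))
    return x(n_k)

is_multiple_of_3 = lambda m: m % 3 == 0
first_indices = list(islice(chosen_indices(x, is_multiple_of_3), 5))
first_values = [f(k, x, is_multiple_of_3) for k in range(1, 6)]
```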

The theorem shows that, roughly speaking, countable sets represent 
the “smallest” infinity: No uncountable set can be a subset of a countable 
set. 

2.9 Definition Let A and $\Omega$ be sets, and suppose that with each element $\alpha$ of
A there is associated a subset of $\Omega$ which we denote by $E_\alpha$.

The set whose elements are the sets $E_\alpha$ will be denoted by $\{E_\alpha\}$. Instead
of speaking of sets of sets, we shall sometimes speak of a collection of sets, or
a family of sets.

The union of the sets $E_\alpha$ is defined to be the set S such that $x \in S$ if and only
if $x \in E_\alpha$ for at least one $\alpha \in A$. We use the notation

(1)   $S = \bigcup_{\alpha \in A} E_\alpha.$

If A consists of the integers $1, 2, \dots, n$, one usually writes

(2)   $S = \bigcup_{m=1}^{n} E_m$

or

(3)   $S = E_1 \cup E_2 \cup \cdots \cup E_n.$

If A is the set of all positive integers, the usual notation is

(4)   $S = \bigcup_{m=1}^{\infty} E_m.$

The symbol $\infty$ in (4) merely indicates that the union of a countable
collection of sets is taken, and should not be confused with the symbols $+\infty$, $-\infty$,
introduced in Definition 1.23.

The intersection of the sets $E_\alpha$ is defined to be the set P such that $x \in P$ if
and only if $x \in E_\alpha$ for every $\alpha \in A$. We use the notation

(5)   $P = \bigcap_{\alpha \in A} E_\alpha,$

or

(6)   $P = \bigcap_{m=1}^{n} E_m = E_1 \cap E_2 \cap \cdots \cap E_n,$

or

(7)   $P = \bigcap_{m=1}^{\infty} E_m,$

as for unions. If $A \cap B$ is not empty, we say that A and B intersect; otherwise they
are disjoint.

2.10 Examples 

(a) Suppose $E_1$ consists of 1, 2, 3 and $E_2$ consists of 2, 3, 4. Then
$E_1 \cup E_2$ consists of 1, 2, 3, 4, whereas $E_1 \cap E_2$ consists of 2, 3.

(b) Let A be the set of real numbers x such that $0 < x \le 1$. For every
$x \in A$, let $E_x$ be the set of real numbers y such that $0 < y < x$. Then

(i) $E_x \subset E_z$ if and only if $0 < x \le z \le 1$;

(ii) $\bigcup_{x \in A} E_x = E_1$;

(iii) $\bigcap_{x \in A} E_x$ is empty;

(i) and (ii) are clear. To prove (iii), we note that for every $y > 0$, $y \notin E_x$
if $x < y$. Hence $y \notin \bigcap_{x \in A} E_x$.

2.11 Remarks Many properties of unions and intersections are quite similar
to those of sums and products; in fact, the words sum and product were
sometimes used in this connection, and the symbols $\Sigma$ and $\Pi$ were written in place
of $\bigcup$ and $\bigcap$.

The commutative and associative laws are trivial:

(8)   $A \cup B = B \cup A; \qquad A \cap B = B \cap A.$

(9)   $(A \cup B) \cup C = A \cup (B \cup C); \qquad (A \cap B) \cap C = A \cap (B \cap C).$

Thus the omission of parentheses in (3) and (6) is justified.

The distributive law also holds:

(10)   $A \cap (B \cup C) = (A \cap B) \cup (A \cap C).$

To prove this, let the left and right members of (10) be denoted by E and F,
respectively.

Suppose $x \in E$. Then $x \in A$ and $x \in B \cup C$, that is, $x \in B$ or $x \in C$
(possibly both). Hence $x \in A \cap B$ or $x \in A \cap C$, so that $x \in F$. Thus $E \subset F$.

Next, suppose $x \in F$. Then $x \in A \cap B$ or $x \in A \cap C$. That is, $x \in A$, and
$x \in B \cup C$. Hence $x \in A \cap (B \cup C)$, so that $F \subset E$.

It follows that $E = F$.
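
For finite sets the distributive law (10) can be checked mechanically with Python's built-in set type, where `|` denotes union and `&` denotes intersection. The three sets below are arbitrary sample data (an assumption for illustration); the check confirms the identity for this sample only, in the same two-inclusion form as the proof above.

```python
# Hypothetical sample sets.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {1, 6}

E = A & (B | C)          # left member of (10)
F = (A & B) | (A & C)    # right member of (10)

E_subset_F = E <= F      # every x in E lies in F
F_subset_E = F <= E      # every x in F lies in E
```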

We list a few more relations which are easily verified:

(11)   $A \subset A \cup B,$

(12)   $A \cap B \subset A.$

If 0 denotes the empty set, then

(13)   $A \cup 0 = A, \qquad A \cap 0 = 0.$

If $A \subset B$, then

(14)   $A \cup B = B, \qquad A \cap B = A.$

2.12 Theorem Let $\{E_n\}$, $n = 1, 2, 3, \dots$, be a sequence of countable sets, and put

(15)   $S = \bigcup_{n=1}^{\infty} E_n.$

Then S is countable.

Proof Let every set $E_n$ be arranged in a sequence $\{x_{nk}\}$, $k = 1, 2, 3, \dots$,
and consider the infinite array

(16)
$$\begin{array}{ccccc}
x_{11} & x_{12} & x_{13} & x_{14} & \cdots \\
x_{21} & x_{22} & x_{23} & x_{24} & \cdots \\
x_{31} & x_{32} & x_{33} & x_{34} & \cdots \\
x_{41} & x_{42} & x_{43} & x_{44} & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}$$

in which the elements of $E_n$ form the nth row. The array contains all
elements of S. Reading the array along its successive diagonals, these
elements can be arranged in a sequence

(17)   $x_{11};\ x_{21}, x_{12};\ x_{31}, x_{22}, x_{13};\ x_{41}, x_{32}, x_{23}, x_{14};\ \dots$

If any two of the sets $E_n$ have elements in common, these will appear more
than once in (17). Hence there is a subset T of the set of all positive
integers such that $S \sim T$, which shows that S is at most countable
(Theorem 2.8). Since $E_1 \subset S$, and $E_1$ is infinite, S is infinite, and thus
countable.
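
The diagonal ordering (17) is easy to generate: on the dth diagonal the index pairs (n, k) satisfy n + k = d, with k increasing. The sketch below is an illustration, not the text's construction in code form. The generator names are hypothetical, and the sample array $x_{nk} = k/n$ (so that $E_n$ is the set of positive multiples of 1/n) is an assumption chosen because its rows overlap, which exercises the removal of repeated elements.

```python
from fractions import Fraction
from itertools import islice

def diagonal_order():
    """Yield index pairs (n, k) in the order of (17): (1,1); (2,1), (1,2); (3,1), ..."""
    d = 2
    while True:
        for k in range(1, d):
            yield d - k, k
        d += 1

def enumerate_union(x):
    """Yield the elements x(n, k) in diagonal order, skipping values already seen."""
    seen = set()
    for n, k in diagonal_order():
        value = x(n, k)
        if value not in seen:
            seen.add(value)
            yield value

# Hypothetical array: row n lists the numbers k/n for k = 1, 2, 3, ...
positive_rationals = lambda n, k: Fraction(k, n)

first_pairs = list(islice(diagonal_order(), 6))
first_values = list(islice(enumerate_union(positive_rationals), 9))
```

With this sample array the union of the rows is the set of positive rational numbers, so the second generator lists each positive rational exactly once.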

Corollary Suppose A is at most countable, and, for every $\alpha \in A$, $B_\alpha$ is at most
countable. Put

$$T = \bigcup_{\alpha \in A} B_\alpha.$$

Then T is at most countable.

For T is equivalent to a subset of (15).

2.13 Theorem Let A be a countable set, and let $B_n$ be the set of all n-tuples
$(a_1, \dots, a_n)$, where $a_k \in A$ ($k = 1, \dots, n$), and the elements $a_1, \dots, a_n$ need not be
distinct. Then $B_n$ is countable.

Proof That $B_1$ is countable is evident, since $B_1 = A$. Suppose $B_{n-1}$ is
countable ($n = 2, 3, 4, \dots$). The elements of $B_n$ are of the form

(18)   $(b, a) \qquad (b \in B_{n-1},\ a \in A).$

For every fixed b, the set of pairs $(b, a)$ is equivalent to A, and hence
countable. Thus $B_n$ is the union of a countable set of countable sets. By
Theorem 2.12, $B_n$ is countable.

The theorem follows by induction.

Corollary The set of all rational numbers is countable.

Proof We apply Theorem 2.13, with $n = 2$, noting that every rational r
is of the form $b/a$, where a and b are integers. The set of pairs $(a, b)$, and
therefore the set of fractions $b/a$, is countable.

In fact, even the set of all algebraic numbers is countable (see
Exercise 2).

That not all infinite sets are, however, countable, is shown by the next
theorem.

2.14 Theorem Let A be the set of all sequences whose elements are the digits 0
and 1. This set A is uncountable.

The elements of A are sequences like 1, 0, 0, 1, 0, 1, 1, 1, ....

Proof Let E be a countable subset of A, and let E consist of the
sequences $s_1, s_2, s_3, \dots$. We construct a sequence s as follows. If the nth
digit in $s_n$ is 1, we let the nth digit of s be 0, and vice versa. Then the
sequence s differs from every member of E in at least one place; hence
$s \notin E$. But clearly $s \in A$, so that E is a proper subset of A.

We have shown that every countable subset of A is a proper subset
of A. It follows that A is uncountable (for otherwise A would be a proper
subset of A, which is absurd).

The idea of the above proof was first used by Cantor, and is called Cantor's
diagonal process; for, if the sequences $s_1, s_2, s_3, \dots$ are placed in an array like
(16), it is the elements on the diagonal which are involved in the construction of
the new sequence.
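
The diagonal process can be written as a one-line transformation once a sequence of digits is modeled as a function from the positive integers to {0, 1}. In the Python sketch below, `diagonal` implements the rule from the proof, while the particular enumeration `E` (whose nth sequence lists the binary digits of n, lowest digit first) is a hypothetical example chosen only to have something concrete to diagonalize against.

```python
def diagonal(E):
    """Given n -> s_n (each s_n maps positions to digits), return the sequence s
    whose nth digit is 0 if the nth digit of s_n is 1, and 1 otherwise."""
    return lambda n: 1 - E(n)(n)

def E(n):
    """Hypothetical s_n: its kth digit is the kth binary digit of n, lowest first."""
    return lambda k: (n >> (k - 1)) & 1

s = diagonal(E)
first_digits = [s(n) for n in range(1, 6)]
differs_on_diagonal = all(s(n) != E(n)(n) for n in range(1, 200))
```

By construction s differs from $s_n$ in the nth place for every n, so s is not among the sequences listed by `E`; the finite check above only samples this fact.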

Readers who are familiar with the binary representation of the real 
numbers (base 2 instead of 10) will notice that Theorem 2.14 implies that the 
set of all real numbers is uncountable. We shall give a second proof of this 
fact in Theorem 2.43. 

METRIC SPACES 

2.15 Definition A set X, whose elements we shall call points, is said to be a
metric space if with any two points p and q of X there is associated a real
number $d(p, q)$, called the distance from p to q, such that

(a) $d(p, q) > 0$ if $p \ne q$; $d(p, p) = 0$;

(b) $d(p, q) = d(q, p)$;

(c) $d(p, q) \le d(p, r) + d(r, q)$, for any $r \in X$.

Any function with these three properties is called a distance function, or
a metric.





2.16 Examples The most important examples of metric spaces, from our
standpoint, are the euclidean spaces $R^k$, especially $R^1$ (the real line) and $R^2$ (the
complex plane); the distance in $R^k$ is defined by

(19)   $d(\mathbf{x}, \mathbf{y}) = |\mathbf{x} - \mathbf{y}| \qquad (\mathbf{x}, \mathbf{y} \in R^k).$

By Theorem 1.37, the conditions of Definition 2.15 are satisfied by (19).

It is important to observe that every subset Y of a metric space X is a metric
space in its own right, with the same distance function. For it is clear that if
conditions (a) to (c) of Definition 2.15 hold for $p, q, r \in X$, they also hold if we
restrict p, q, r to lie in Y.

Thus every subset of a euclidean space is a metric space. Other examples
are the spaces $\mathcal{C}(K)$ and $\mathcal{L}^2(\mu)$, which are discussed in Chaps. 7 and 11,
respectively.


2.17 Definition By the segment $(a, b)$ we mean the set of all real numbers x
such that $a < x < b$.

By the interval $[a, b]$ we mean the set of all real numbers x such that
$a \le x \le b$.

Occasionally we shall also encounter "half-open intervals" $[a, b)$ and $(a, b]$;
the first consists of all x such that $a \le x < b$, the second of all x such that
$a < x \le b$.

If $a_i < b_i$ for $i = 1, \dots, k$, the set of all points $\mathbf{x} = (x_1, \dots, x_k)$ in $R^k$ whose
coordinates satisfy the inequalities $a_i \le x_i \le b_i$ ($1 \le i \le k$) is called a k-cell.
Thus a 1-cell is an interval, a 2-cell is a rectangle, etc.

If $\mathbf{x} \in R^k$ and $r > 0$, the open (or closed) ball B with center at $\mathbf{x}$ and radius r
is defined to be the set of all $\mathbf{y} \in R^k$ such that $|\mathbf{y} - \mathbf{x}| < r$ (or $|\mathbf{y} - \mathbf{x}| \le r$).

We call a set $E \subset R^k$ convex if

$$\lambda\mathbf{x} + (1 - \lambda)\mathbf{y} \in E$$

whenever $\mathbf{x} \in E$, $\mathbf{y} \in E$, and $0 < \lambda < 1$.

For example, balls are convex. For if $|\mathbf{y} - \mathbf{x}| < r$, $|\mathbf{z} - \mathbf{x}| < r$, and
$0 < \lambda < 1$, we have

$$\begin{aligned}
|\lambda\mathbf{y} + (1 - \lambda)\mathbf{z} - \mathbf{x}| &= |\lambda(\mathbf{y} - \mathbf{x}) + (1 - \lambda)(\mathbf{z} - \mathbf{x})| \\
&\le \lambda|\mathbf{y} - \mathbf{x}| + (1 - \lambda)|\mathbf{z} - \mathbf{x}| < \lambda r + (1 - \lambda)r \\
&= r.
\end{aligned}$$

The same proof applies to closed balls. It is also easy to see that k-cells are
convex.

2.18 Definition Let X be a metric space. All points and sets mentioned below
are understood to be elements and subsets of X.

(a) A neighborhood of a point p is a set $N_r(p)$ consisting of all points q
such that $d(p, q) < r$. The number r is called the radius of $N_r(p)$.

(b) A point p is a limit point of the set E if every neighborhood of p
contains a point $q \ne p$ such that $q \in E$.

(c) If $p \in E$ and p is not a limit point of E, then p is called an isolated
point of E.

(d) E is closed if every limit point of E is a point of E.

(e) A point p is an interior point of E if there is a neighborhood N of p
such that $N \subset E$.

(f) E is open if every point of E is an interior point of E.

(g) The complement of E (denoted by $E^c$) is the set of all points $p \in X$
such that $p \notin E$.

(h) E is perfect if E is closed and if every point of E is a limit point
of E.

(i) E is bounded if there is a real number M and a point $q \in X$ such that
$d(p, q) < M$ for all $p \in E$.

(j) E is dense in X if every point of X is a limit point of E, or a point of
E (or both).

Let us note that in $R^1$ neighborhoods are segments, whereas in $R^2$
neighborhoods are interiors of circles.

2.19 Theorem Every neighborhood is an open set. 

Proof Consider a neighborhood $E = N_r(p)$, and let q be any point of E.
Then there is a positive real number h such that

$$d(p, q) = r - h.$$

For all points s such that $d(q, s) < h$, we have then

$$d(p, s) \le d(p, q) + d(q, s) < r - h + h = r,$$

so that $s \in E$. Thus q is an interior point of E.

2.20 Theorem If $p$ is a limit point of a set $E$, then every neighborhood of $p$
contains infinitely many points of $E$.

Proof Suppose there is a neighborhood $N$ of $p$ which contains only a
finite number of points of $E$. Let $q_1, \dots, q_n$ be those points of $N \cap E$
which are distinct from $p$, and put

\[ r = \min_{1 \le m \le n} d(p, q_m) \]

[we use this notation to denote the smallest of the numbers $d(p, q_1), \dots,
d(p, q_n)$]. The minimum of a finite set of positive numbers is clearly positive,
so that $r > 0$.

The neighborhood $N_r(p)$ contains no point $q$ of $E$ such that $q \ne p$,
so that $p$ is not a limit point of $E$. This contradiction establishes the
theorem.

Corollary A finite point set has no limit points. 
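
The choice of $r$ in the proof of Theorem 2.20 is easy to carry out for a concrete finite set. The sketch below is an illustration only, for points of the plane with the euclidean distance; the helper names `isolating_radius` and `punctured_neighborhood_points` are hypothetical.

```python
import math


def isolating_radius(p, points):
    """Return r = min d(p, q) over q in points with q != p, or None if no such q."""
    others = [q for q in points if q != p]
    if not others:
        return None
    return min(math.dist(p, q) for q in others)


def punctured_neighborhood_points(p, r, points):
    """Points q of the given set with q != p and d(p, q) < r."""
    return [q for q in points if q != p and math.dist(p, q) < r]


E = [(0, 0), (1, 0), (0, 2), (3, 4)]
for p in E + [(0.5, 0.5)]:
    r = isolating_radius(p, E)
    # The neighborhood of radius r about p meets E in no point other than p.
    print(p, r, punctured_neighborhood_points(p, r, E))
```

For every center tried, the last list is empty, which is the statement that no point is a limit point of the finite set `E`.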

2.21 Examples Let us consider the following subsets of $R^2$:

(a) The set of all complex $z$ such that $|z| < 1$.

(b) The set of all complex $z$ such that $|z| \le 1$.

(c) A finite set.

(d) The set of all integers.

(e) The set consisting of the numbers $1/n$ $(n = 1, 2, 3, \dots)$. Let us note
that this set $E$ has a limit point (namely, $z = 0$) but that no point of $E$ is
a limit point of $E$; we wish to stress the difference between having a limit
point and containing one.

(f) The set of all complex numbers (that is, $R^2$).

(g) The segment $(a, b)$.

Let us note that (d), (e), (g) can be regarded also as subsets of $R^1$.

Some properties of these sets are tabulated below:

         Closed    Open    Perfect    Bounded
(a)      No        Yes     No         Yes
(b)      Yes       No      Yes        Yes
(c)      Yes       No      No         Yes
(d)      Yes       No      No         No
(e)      No        No      No         Yes
(f)      Yes       Yes     Yes        No
(g)      No                No         Yes

In (g), we left the second entry blank. The reason is that the segment
$(a, b)$ is not open if we regard it as a subset of $R^2$, but it is an open subset of $R^1$.

2.22 Theorem Let $\{E_\alpha\}$ be a (finite or infinite) collection of sets $E_\alpha$. Then

\[ \Bigl(\bigcup_\alpha E_\alpha\Bigr)^c = \bigcap_\alpha (E_\alpha^c). \tag{20} \]

Proof Let $A$ and $B$ be the left and right members of (20). If $x \in A$, then
$x \notin \bigcup_\alpha E_\alpha$, hence $x \notin E_\alpha$ for any $\alpha$, hence $x \in E_\alpha^c$ for every $\alpha$, so that
$x \in \bigcap_\alpha E_\alpha^c$. Thus $A \subset B$.

Conversely, if $x \in B$, then $x \in E_\alpha^c$ for every $\alpha$, hence $x \notin E_\alpha$ for any $\alpha$,
hence $x \notin \bigcup_\alpha E_\alpha$, so that $x \in (\bigcup_\alpha E_\alpha)^c$. Thus $B \subset A$.

It follows that $A = B$.
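
For finite families of subsets of a finite set, the identity of Theorem 2.22 can be verified mechanically. The following sketch is an illustration under that finiteness assumption; the names `complement` and `check_de_morgan` are ours.

```python
def complement(subset, universe):
    """Complement of subset relative to universe."""
    return universe - subset


def check_de_morgan(universe, family):
    """Compare the complement of the union with the intersection of the complements."""
    union = set().union(*family)
    left = complement(union, universe)
    right = set(universe)  # the intersection over an empty family is the whole universe
    for member in family:
        right &= complement(member, universe)
    return left == right


X = set(range(10))
print(check_de_morgan(X, [{1, 2, 3}, {3, 4}, {7}]))
```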

2.23 Theorem A set $E$ is open if and only if its complement is closed.

Proof First, suppose $E^c$ is closed. Choose $x \in E$. Then $x \notin E^c$, and $x$ is
not a limit point of $E^c$. Hence there exists a neighborhood $N$ of $x$ such
that $E^c \cap N$ is empty, that is, $N \subset E$. Thus $x$ is an interior point of $E$,
and $E$ is open.

Next, suppose $E$ is open. Let $x$ be a limit point of $E^c$. Then every
neighborhood of $x$ contains a point of $E^c$, so that $x$ is not an interior point
of $E$. Since $E$ is open, this means that $x \in E^c$. It follows that $E^c$ is closed.

Corollary A set $F$ is closed if and only if its complement is open.

2.24 Theorem

(a) For any collection $\{G_\alpha\}$ of open sets, $\bigcup_\alpha G_\alpha$ is open.

(b) For any collection $\{F_\alpha\}$ of closed sets, $\bigcap_\alpha F_\alpha$ is closed.

(c) For any finite collection $G_1, \dots, G_n$ of open sets, $\bigcap_{i=1}^n G_i$ is open.

(d) For any finite collection $F_1, \dots, F_n$ of closed sets, $\bigcup_{i=1}^n F_i$ is closed.

Proof Put $G = \bigcup_\alpha G_\alpha$. If $x \in G$, then $x \in G_\alpha$ for some $\alpha$. Since $x$ is an
interior point of $G_\alpha$, $x$ is also an interior point of $G$, and $G$ is open. This
proves (a).

By Theorem 2.22,

\[ \Bigl(\bigcap_\alpha F_\alpha\Bigr)^c = \bigcup_\alpha (F_\alpha^c), \tag{21} \]

and $F_\alpha^c$ is open, by Theorem 2.23. Hence (a) implies that (21) is open so
that $\bigcap_\alpha F_\alpha$ is closed.

Next, put $H = \bigcap_{i=1}^n G_i$. For any $x \in H$, there exist neighborhoods
$N_i$ of $x$, with radii $r_i$, such that $N_i \subset G_i$ $(i = 1, \dots, n)$. Put

\[ r = \min(r_1, \dots, r_n), \]

and let $N$ be the neighborhood of $x$ of radius $r$. Then $N \subset G_i$ for $i = 1,
\dots, n$, so that $N \subset H$, and $H$ is open.

By taking complements, (d) follows from (c):

\[ \Bigl(\bigcup_{i=1}^n F_i\Bigr)^c = \bigcap_{i=1}^n (F_i^c). \]

2.25 Examples In parts (c) and (d) of the preceding theorem, the finiteness of
the collections is essential. For let $G_n$ be the segment $\left(-\frac{1}{n}, \frac{1}{n}\right)$ $(n = 1, 2, 3, \dots)$.
Then $G_n$ is an open subset of $R^1$. Put $G = \bigcap_{n=1}^\infty G_n$. Then $G$ consists of a single
point (namely, $x = 0$) and is therefore not an open subset of $R^1$.

Thus the intersection of an infinite collection of open sets need not be open.
Similarly, the union of an infinite collection of closed sets need not be closed.
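
To see concretely why only $x = 0$ survives in every $G_n$, one can compute, for a given rational $x \ne 0$, the first index $n$ with $x \notin G_n$. This is a small illustrative sketch; the function names are ours, and rational inputs are assumed so that the comparisons are exact.

```python
from fractions import Fraction
import math


def in_G(x, n):
    """True if x lies in the segment (-1/n, 1/n)."""
    return abs(x) < Fraction(1, n)


def first_index_excluding(x):
    """Smallest n with x not in G_n, for x != 0; None for x == 0."""
    x = Fraction(x)
    if x == 0:
        return None  # 0 lies in every G_n
    # x is not in G_n  <=>  |x| >= 1/n  <=>  n >= 1/|x|
    return math.ceil(1 / abs(x))


for x in (Fraction(3, 7), Fraction(-1, 10), Fraction(5), Fraction(0)):
    print(x, first_index_excluding(x))
```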

2.26 Definition If $X$ is a metric space, if $E \subset X$, and if $E'$ denotes the set of
all limit points of $E$ in $X$, then the closure of $E$ is the set $\bar{E} = E \cup E'$.

2.27 Theorem If $X$ is a metric space and $E \subset X$, then

(a) $\bar{E}$ is closed,

(b) $E = \bar{E}$ if and only if $E$ is closed,

(c) $\bar{E} \subset F$ for every closed set $F \subset X$ such that $E \subset F$.

By (a) and (c), $\bar{E}$ is the smallest closed subset of $X$ that contains $E$.

Proof

(a) If $p \in X$ and $p \notin \bar{E}$ then $p$ is neither a point of $E$ nor a limit point of $E$.
Hence $p$ has a neighborhood which does not intersect $E$. The complement
of $\bar{E}$ is therefore open. Hence $\bar{E}$ is closed.

(b) If $E = \bar{E}$, (a) implies that $E$ is closed. If $E$ is closed, then $E' \subset E$
[by Definitions 2.18(d) and 2.26], hence $\bar{E} = E$.

(c) If $F$ is closed and $F \supset E$, then $F \supset F'$, hence $F \supset E'$. Thus $F \supset \bar{E}$.

2.28 Theorem Let $E$ be a nonempty set of real numbers which is bounded above.
Let $y = \sup E$. Then $y \in \bar{E}$. Hence $y \in E$ if $E$ is closed.

Compare this with the examples in Sec. 1.9.

Proof If $y \in E$ then $y \in \bar{E}$. Assume $y \notin E$. For every $h > 0$ there exists
then a point $x \in E$ such that $y - h < x < y$, for otherwise $y - h$ would be
an upper bound of $E$. Thus $y$ is a limit point of $E$. Hence $y \in \bar{E}$.

2.29 Remark Suppose $E \subset Y \subset X$, where $X$ is a metric space. To say that $E$
is an open subset of $X$ means that to each point $p \in E$ there is associated a
positive number $r$ such that the conditions $d(p, q) < r$, $q \in X$ imply that $q \in E$.
But we have already observed (Sec. 2.16) that $Y$ is also a metric space, so that
our definitions may equally well be made within $Y$. To be quite explicit, let us
say that $E$ is open relative to $Y$ if to each $p \in E$ there is associated an $r > 0$ such
that $q \in E$ whenever $d(p, q) < r$ and $q \in Y$. Example 2.21(g) showed that a set
may be open relative to $Y$ without being an open subset of $X$. However, there
is a simple relation between these concepts, which we now state.

2.30 Theorem Suppose $Y \subset X$. A subset $E$ of $Y$ is open relative to $Y$ if and
only if $E = Y \cap G$ for some open subset $G$ of $X$.

Proof Suppose $E$ is open relative to $Y$. To each $p \in E$ there is a positive
number $r_p$ such that the conditions $d(p, q) < r_p$, $q \in Y$ imply that $q \in E$.
Let $V_p$ be the set of all $q \in X$ such that $d(p, q) < r_p$, and define

\[ G = \bigcup_{p \in E} V_p. \]

Then $G$ is an open subset of $X$, by Theorems 2.19 and 2.24.

Since $p \in V_p$ for all $p \in E$, it is clear that $E \subset G \cap Y$.

By our choice of $V_p$, we have $V_p \cap Y \subset E$ for every $p \in E$, so that
$G \cap Y \subset E$. Thus $E = G \cap Y$, and one half of the theorem is proved.

Conversely, if $G$ is open in $X$ and $E = G \cap Y$, every $p \in E$ has a
neighborhood $V_p \subset G$. Then $V_p \cap Y \subset E$, so that $E$ is open relative to $Y$.


COMPACT SETS 

2.31 Definition By an open cover of a set $E$ in a metric space $X$ we mean a
collection $\{G_\alpha\}$ of open subsets of $X$ such that $E \subset \bigcup_\alpha G_\alpha$.

2.32 Definition A subset $K$ of a metric space $X$ is said to be compact if every
open cover of $K$ contains a finite subcover.

More explicitly, the requirement is that if $\{G_\alpha\}$ is an open cover of $K$, then
there are finitely many indices $\alpha_1, \dots, \alpha_n$ such that

\[ K \subset G_{\alpha_1} \cup \cdots \cup G_{\alpha_n}. \]

The notion of compactness is of great importance in analysis, especially
in connection with continuity (Chap. 4).

It is clear that every finite set is compact. The existence of a large class of
infinite compact sets in $R^k$ will follow from Theorem 2.41.

We observed earlier (in Sec. 2.29) that if $E \subset Y \subset X$, then $E$ may be open
relative to $Y$ without being open relative to $X$. The property of being open thus
depends on the space in which $E$ is embedded. The same is true of the property
of being closed.

Compactness, however, behaves better, as we shall now see. To formulate
the next theorem, let us say, temporarily, that $K$ is compact relative to $X$ if
the requirements of Definition 2.32 are met.

2.33 Theorem Suppose $K \subset Y \subset X$. Then $K$ is compact relative to $X$ if and
only if $K$ is compact relative to $Y$.

By virtue of this theorem we are able, in many situations, to regard compact
sets as metric spaces in their own right, without paying any attention to
any embedding space. In particular, although it makes little sense to talk of
open spaces, or of closed spaces (every metric space $X$ is an open subset of itself,
and is a closed subset of itself), it does make sense to talk of compact metric
spaces.

Proof Suppose $K$ is compact relative to $X$, and let $\{V_\alpha\}$ be a collection
of sets, open relative to $Y$, such that $K \subset \bigcup_\alpha V_\alpha$. By Theorem 2.30, there
are sets $G_\alpha$, open relative to $X$, such that $V_\alpha = Y \cap G_\alpha$, for all $\alpha$; and since
$K$ is compact relative to $X$, we have

\[ K \subset G_{\alpha_1} \cup \cdots \cup G_{\alpha_n} \tag{22} \]

for some choice of finitely many indices $\alpha_1, \dots, \alpha_n$. Since $K \subset Y$, (22)
implies

\[ K \subset V_{\alpha_1} \cup \cdots \cup V_{\alpha_n}. \tag{23} \]

This proves that $K$ is compact relative to $Y$.

Conversely, suppose $K$ is compact relative to $Y$, let $\{G_\alpha\}$ be a collection
of open subsets of $X$ which covers $K$, and put $V_\alpha = Y \cap G_\alpha$. Then
(23) will hold for some choice of $\alpha_1, \dots, \alpha_n$; and since $V_\alpha \subset G_\alpha$, (23)
implies (22).

This completes the proof.

2.34 Theorem Compact subsets of metric spaces are closed.

Proof Let $K$ be a compact subset of a metric space $X$. We shall prove
that the complement of $K$ is an open subset of $X$.

Suppose $p \in X$, $p \notin K$. If $q \in K$, let $V_q$ and $W_q$ be neighborhoods of $p$
and $q$, respectively, of radius less than $\frac{1}{2} d(p, q)$ [see Definition 2.18(a)].
Since $K$ is compact, there are finitely many points $q_1, \dots, q_n$ in $K$ such that

\[ K \subset W_{q_1} \cup \cdots \cup W_{q_n} = W. \]

If $V = V_{q_1} \cap \cdots \cap V_{q_n}$, then $V$ is a neighborhood of $p$ which does not
intersect $W$. Hence $V \subset K^c$, so that $p$ is an interior point of $K^c$. The
theorem follows.

2.35 Theorem Closed subsets of compact sets are compact.

Proof Suppose $F \subset K \subset X$, $F$ is closed (relative to $X$), and $K$ is compact.
Let $\{V_\alpha\}$ be an open cover of $F$. If $F^c$ is adjoined to $\{V_\alpha\}$, we obtain an
open cover $\Omega$ of $K$. Since $K$ is compact, there is a finite subcollection $\Phi$
of $\Omega$ which covers $K$, and hence $F$. If $F^c$ is a member of $\Phi$, we may remove
it from $\Phi$ and still retain an open cover of $F$. We have thus shown that a
finite subcollection of $\{V_\alpha\}$ covers $F$.

Corollary If $F$ is closed and $K$ is compact, then $F \cap K$ is compact.

Proof Theorems 2.24(b) and 2.34 show that $F \cap K$ is closed; since
$F \cap K \subset K$, Theorem 2.35 shows that $F \cap K$ is compact.

2.36 Theorem If $\{K_\alpha\}$ is a collection of compact subsets of a metric space $X$ such
that the intersection of every finite subcollection of $\{K_\alpha\}$ is nonempty, then $\bigcap K_\alpha$
is nonempty.

Proof Fix a member $K_1$ of $\{K_\alpha\}$ and put $G_\alpha = K_\alpha^c$. Assume that no point
of $K_1$ belongs to every $K_\alpha$. Then the sets $G_\alpha$ form an open cover of $K_1$;
and since $K_1$ is compact, there are finitely many indices $\alpha_1, \dots, \alpha_n$ such
that $K_1 \subset G_{\alpha_1} \cup \cdots \cup G_{\alpha_n}$. But this means that

\[ K_1 \cap K_{\alpha_1} \cap \cdots \cap K_{\alpha_n} \]

is empty, in contradiction to our hypothesis.

Corollary If $\{K_n\}$ is a sequence of nonempty compact sets such that $K_n \supset K_{n+1}$
$(n = 1, 2, 3, \dots)$, then $\bigcap_1^\infty K_n$ is not empty.

2.37 Theorem If $E$ is an infinite subset of a compact set $K$, then $E$ has a limit
point in $K$.

Proof If no point of $K$ were a limit point of $E$, then each $q \in K$ would
have a neighborhood $V_q$ which contains at most one point of $E$ (namely,
$q$, if $q \in E$). It is clear that no finite subcollection of $\{V_q\}$ can cover $E$;
and the same is true of $K$, since $E \subset K$. This contradicts the compactness
of $K$.

2.38 Theorem If $\{I_n\}$ is a sequence of intervals in $R^1$, such that $I_n \supset I_{n+1}$
$(n = 1, 2, 3, \dots)$, then $\bigcap_1^\infty I_n$ is not empty.

Proof If $I_n = [a_n, b_n]$, let $E$ be the set of all $a_n$. Then $E$ is nonempty and
bounded above (by $b_1$). Let $x$ be the sup of $E$. If $m$ and $n$ are positive
integers, then

\[ a_n \le a_{m+n} \le b_{m+n} \le b_m, \]

so that $x \le b_m$ for each $m$. Since it is obvious that $a_m \le x$, we see that
$x \in I_m$ for $m = 1, 2, 3, \dots$.
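
The chain of inequalities in this proof can be watched at work on a finite truncation of a nested sequence. The sketch below is an illustration only: for finitely many intervals the supremum of the left endpoints is simply their maximum, and the names `is_nested` and `sup_of_left_endpoints` are ours.

```python
from fractions import Fraction


def is_nested(intervals):
    """True if each [a, b] is a genuine interval containing the next one."""
    proper = all(a <= b for a, b in intervals)
    shrinking = all(
        a1 <= a2 and b2 <= b1
        for (a1, b1), (a2, b2) in zip(intervals, intervals[1:])
    )
    return proper and shrinking


def sup_of_left_endpoints(intervals):
    """The point x = sup{a_n}; for a finite list this is max a_n."""
    return max(a for a, _ in intervals)


# I_n = [-1/n, 1/n] for n = 1, ..., 19
I = [(Fraction(-1, n), Fraction(1, n)) for n in range(1, 20)]
x = sup_of_left_endpoints(I)
print(is_nested(I), x, all(a <= x <= b for a, b in I))
```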
2.39 Theorem Let $k$ be a positive integer. If $\{I_n\}$ is a sequence of $k$-cells such
that $I_n \supset I_{n+1}$ $(n = 1, 2, 3, \dots)$, then $\bigcap_1^\infty I_n$ is not empty.

Proof Let $I_n$ consist of all points $x = (x_1, \dots, x_k)$ such that

\[ a_{n,j} \le x_j \le b_{n,j} \qquad (1 \le j \le k;\ n = 1, 2, 3, \dots), \]

and put $I_{n,j} = [a_{n,j}, b_{n,j}]$. For each $j$, the sequence $\{I_{n,j}\}$ satisfies the
hypotheses of Theorem 2.38. Hence there are real numbers $x_j^*$ $(1 \le j \le k)$
such that

\[ a_{n,j} \le x_j^* \le b_{n,j} \qquad (1 \le j \le k;\ n = 1, 2, 3, \dots). \]

Setting $x^* = (x_1^*, \dots, x_k^*)$, we see that $x^* \in I_n$ for $n = 1, 2, 3, \dots$. The
theorem follows.

2.40 Theorem Every $k$-cell is compact.

Proof Let $I$ be a $k$-cell, consisting of all points $x = (x_1, \dots, x_k)$ such
that $a_j \le x_j \le b_j$ $(1 \le j \le k)$. Put

\[ \delta = \Bigl\{ \sum_{j=1}^k (b_j - a_j)^2 \Bigr\}^{1/2}. \]

Then $|x - y| \le \delta$, if $x \in I$, $y \in I$.

Suppose, to get a contradiction, that there exists an open cover $\{G_\alpha\}$
of $I$ which contains no finite subcover of $I$. Put $c_j = (a_j + b_j)/2$. The
intervals $[a_j, c_j]$ and $[c_j, b_j]$ then determine $2^k$ $k$-cells $Q_i$ whose union is $I$.
At least one of these sets $Q_i$, call it $I_1$, cannot be covered by any finite
subcollection of $\{G_\alpha\}$ (otherwise $I$ could be so covered). We next subdivide
$I_1$ and continue the process. We obtain a sequence $\{I_n\}$ with the following
properties:

(a) $I \supset I_1 \supset I_2 \supset I_3 \supset \cdots$;

(b) $I_n$ is not covered by any finite subcollection of $\{G_\alpha\}$;

(c) if $x \in I_n$ and $y \in I_n$, then $|x - y| \le 2^{-n}\delta$.

By (a) and Theorem 2.39, there is a point $x^*$ which lies in every $I_n$.
For some $\alpha$, $x^* \in G_\alpha$. Since $G_\alpha$ is open, there exists $r > 0$ such that
$|y - x^*| < r$ implies that $y \in G_\alpha$. If $n$ is so large that $2^{-n}\delta < r$ (there is
such an $n$, for otherwise $2^n \le \delta/r$ for all positive integers $n$, which is
absurd since $R$ is archimedean), then (c) implies that $I_n \subset G_\alpha$, which contradicts (b).

This completes the proof.

The equivalence of (a) and (b) in the next theorem is known as the Heine-Borel
theorem.

2.41 Theorem If a set $E$ in $R^k$ has one of the following three properties, then it
has the other two:

(a) $E$ is closed and bounded.

(b) $E$ is compact.

(c) Every infinite subset of $E$ has a limit point in $E$.

Proof If (a) holds, then $E \subset I$ for some $k$-cell $I$, and (b) follows from
Theorems 2.40 and 2.35. Theorem 2.37 shows that (b) implies (c). It
remains to be shown that (c) implies (a).

If $E$ is not bounded, then $E$ contains points $x_n$ with

\[ |x_n| > n \qquad (n = 1, 2, 3, \dots). \]

The set $S$ consisting of these points $x_n$ is infinite and clearly has no limit
point in $R^k$, hence has none in $E$. Thus (c) implies that $E$ is bounded.

If $E$ is not closed, then there is a point $x_0 \in R^k$ which is a limit point
of $E$ but not a point of $E$. For $n = 1, 2, 3, \dots$, there are points $x_n \in E$
such that $|x_n - x_0| < 1/n$. Let $S$ be the set of these points $x_n$. Then $S$ is
infinite (otherwise $|x_n - x_0|$ would have a constant positive value, for
infinitely many $n$), $S$ has $x_0$ as a limit point, and $S$ has no other limit
point in $R^k$. For if $y \in R^k$, $y \ne x_0$, then

\[
\begin{aligned}
|x_n - y| &\ge |x_0 - y| - |x_n - x_0| \\
&\ge |x_0 - y| - \frac{1}{n} \ge \frac{1}{2}|x_0 - y|
\end{aligned}
\]

for all but finitely many $n$; this shows that $y$ is not a limit point of $S$
(Theorem 2.20).

Thus $S$ has no limit point in $E$; hence $E$ must be closed if (c) holds.

We should remark, at this point, that (b) and (c) are equivalent in any
metric space (Exercise 26) but that (a) does not, in general, imply (b) and (c).
Examples are furnished by Exercise 16 and by the space $\mathscr{L}^2$, which is discussed
in Chap. 11.

2.42 Theorem (Weierstrass) Every bounded infinite subset of $R^k$ has a limit
point in $R^k$.

Proof Being bounded, the set $E$ in question is a subset of a $k$-cell $I \subset R^k$.
By Theorem 2.40, $I$ is compact, and so $E$ has a limit point in $I$, by
Theorem 2.37.

PERFECT SETS

2.43 Theorem Let $P$ be a nonempty perfect set in $R^k$. Then $P$ is uncountable.

Proof Since $P$ has limit points, $P$ must be infinite. Suppose $P$ is countable,
and denote the points of $P$ by $x_1, x_2, x_3, \dots$. We shall construct a
sequence $\{V_n\}$ of neighborhoods, as follows.

Let $V_1$ be any neighborhood of $x_1$. If $V_1$ consists of all $y \in R^k$ such
that $|y - x_1| < r$, the closure $\bar{V}_1$ of $V_1$ is the set of all $y \in R^k$ such that
$|y - x_1| \le r$.

Suppose $V_n$ has been constructed, so that $V_n \cap P$ is not empty. Since
every point of $P$ is a limit point of $P$, there is a neighborhood $V_{n+1}$ such
that (i) $\bar{V}_{n+1} \subset V_n$, (ii) $x_n \notin \bar{V}_{n+1}$, (iii) $V_{n+1} \cap P$ is not empty. By (iii),
$V_{n+1}$ satisfies our induction hypothesis, and the construction can proceed.

Put $K_n = \bar{V}_n \cap P$. Since $\bar{V}_n$ is closed and bounded, $\bar{V}_n$ is compact.
Since $x_n \notin K_{n+1}$, no point of $P$ lies in $\bigcap_1^\infty K_n$. Since $K_n \subset P$, this implies
that $\bigcap_1^\infty K_n$ is empty. But each $K_n$ is nonempty, by (iii), and $K_n \supset K_{n+1}$,
by (i); this contradicts the Corollary to Theorem 2.36.

Corollary Every interval $[a, b]$ $(a < b)$ is uncountable. In particular, the set of
all real numbers is uncountable.

2.44 The Cantor set The set which we are now going to construct shows
that there exist perfect sets in $R^1$ which contain no segment.

Let $E_0$ be the interval $[0, 1]$. Remove the segment $\left(\frac{1}{3}, \frac{2}{3}\right)$, and let $E_1$ be
the union of the intervals

\[ \left[0, \tfrac{1}{3}\right], \quad \left[\tfrac{2}{3}, 1\right]. \]

Remove the middle thirds of these intervals, and let $E_2$ be the union of the
intervals

\[ \left[0, \tfrac{1}{9}\right], \quad \left[\tfrac{2}{9}, \tfrac{3}{9}\right], \quad \left[\tfrac{6}{9}, \tfrac{7}{9}\right], \quad \left[\tfrac{8}{9}, 1\right]. \]

Continuing in this way, we obtain a sequence of compact sets $E_n$, such that

(a) $E_1 \supset E_2 \supset E_3 \supset \cdots$;

(b) $E_n$ is the union of $2^n$ intervals, each of length $3^{-n}$.

The set

\[ P = \bigcap_{n=1}^{\infty} E_n \]

is called the Cantor set. $P$ is clearly compact, and Theorem 2.36 shows that $P$
is not empty.
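
The stages $E_n$ are easy to generate by machine, which gives a quick check of property (b). The following sketch is an illustration; the function names are ours, and exact rational arithmetic is used so that the endpoints come out as the fractions shown above.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_stage(n):
    """Return the list of closed intervals whose union is E_n (E_0 = [0, 1])."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals


for n in range(4):
    stage = cantor_stage(n)
    lengths = {b - a for a, b in stage}
    # number of intervals, their common length, and the total length of E_n
    print(n, len(stage), lengths, sum(b - a for a, b in stage))
```

The total length of $E_n$ is $2^n \cdot 3^{-n}$, which tends to $0$ as $n$ increases.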
No segment of the form

\[ \left( \frac{3k + 1}{3^m}, \frac{3k + 2}{3^m} \right), \tag{24} \]

where $k$ and $m$ are positive integers, has a point in common with $P$. Since every
segment $(\alpha, \beta)$ contains a segment of the form (24), if

\[ 3^{-m} < \frac{\beta - \alpha}{6}, \]

$P$ contains no segment.

To show that $P$ is perfect, it is enough to show that $P$ contains no isolated
point. Let $x \in P$, and let $S$ be any segment containing $x$. Let $I_n$ be that interval
of $E_n$ which contains $x$. Choose $n$ large enough, so that $I_n \subset S$. Let $x_n$ be an
endpoint of $I_n$, such that $x_n \ne x$.

It follows from the construction of $P$ that $x_n \in P$. Hence $x$ is a limit point
of $P$, and $P$ is perfect.

One of the most interesting properties of the Cantor set is that it provides
us with an example of an uncountable set of measure zero (the concept of
measure will be discussed in Chap. 11).


CONNECTED SETS 

2.45 Definition Two subsets $A$ and $B$ of a metric space $X$ are said to be
separated if both $A \cap \bar{B}$ and $\bar{A} \cap B$ are empty, i.e., if no point of $A$ lies in the
closure of $B$ and no point of $B$ lies in the closure of $A$.

A set $E \subset X$ is said to be connected if $E$ is not a union of two nonempty
separated sets.

2.46 Remark Separated sets are of course disjoint, but disjoint sets need not
be separated. For example, the interval $[0, 1]$ and the segment $(1, 2)$ are not
separated, since $1$ is a limit point of $(1, 2)$. However, the segments $(0, 1)$ and
$(1, 2)$ are separated.

The connected subsets of the line have a particularly simple structure:

2.47 Theorem A subset $E$ of the real line $R^1$ is connected if and only if it has the
following property: If $x \in E$, $y \in E$, and $x < z < y$, then $z \in E$.

Proof If there exist $x \in E$, $y \in E$, and some $z \in (x, y)$ such that $z \notin E$, then
$E = A_z \cup B_z$ where

\[ A_z = E \cap (-\infty, z), \qquad B_z = E \cap (z, \infty). \]

Since $x \in A_z$ and $y \in B_z$, $A_z$ and $B_z$ are nonempty. Since $A_z \subset (-\infty, z)$ and
$B_z \subset (z, \infty)$, they are separated. Hence $E$ is not connected.

To prove the converse, suppose $E$ is not connected. Then there are
nonempty separated sets $A$ and $B$ such that $A \cup B = E$. Pick $x \in A$, $y \in B$,
and assume (without loss of generality) that $x < y$. Define

\[ z = \sup (A \cap [x, y]). \]

By Theorem 2.28, $z \in \bar{A}$; hence $z \notin B$. In particular, $x \le z < y$.

If $z \notin A$, it follows that $x < z < y$ and $z \notin E$.

If $z \in A$, then $z \notin \bar{B}$, hence there exists $z_1$ such that $z < z_1 < y$ and
$z_1 \notin B$. Then $x < z_1 < y$ and $z_1 \notin E$.


EXERCISES 

1. Prove that the empty set is a subset of every set.

2. A complex number $z$ is said to be algebraic if there are integers $a_0, \dots, a_n$, not all
zero, such that

\[ a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n = 0. \]

Prove that the set of all algebraic numbers is countable. Hint: For every positive
integer $N$ there are only finitely many equations with

\[ n + |a_0| + |a_1| + \cdots + |a_n| = N. \]

3. Prove that there exist real numbers which are not algebraic.

4. Is the set of all irrational real numbers countable?

5. Construct a bounded set of real numbers with exactly three limit points.

6. Let $E'$ be the set of all limit points of a set $E$. Prove that $E'$ is closed. Prove that
$E$ and $\bar{E}$ have the same limit points. (Recall that $\bar{E} = E \cup E'$.) Do $E$ and $E'$ always
have the same limit points?

7. Let $A_1, A_2, A_3, \dots$ be subsets of a metric space.

(a) If $B_n = \bigcup_{i=1}^n A_i$, prove that $\bar{B}_n = \bigcup_{i=1}^n \bar{A}_i$, for $n = 1, 2, 3, \dots$.

(b) If $B = \bigcup_{i=1}^\infty A_i$, prove that $\bar{B} \supset \bigcup_{i=1}^\infty \bar{A}_i$.

Show, by an example, that this inclusion can be proper.

8. Is every point of every open set $E \subset R^2$ a limit point of $E$? Answer the same
question for closed sets in $R^2$.

9. Let $E^\circ$ denote the set of all interior points of a set $E$. [See Definition 2.18(e);
$E^\circ$ is called the interior of $E$.]

(a) Prove that $E^\circ$ is always open.

(b) Prove that $E$ is open if and only if $E^\circ = E$.

(c) If $G \subset E$ and $G$ is open, prove that $G \subset E^\circ$.

(d) Prove that the complement of $E^\circ$ is the closure of the complement of $E$.

(e) Do $E$ and $\bar{E}$ always have the same interiors?

(f) Do $E$ and $E^\circ$ always have the same closures?

10. Let $X$ be an infinite set. For $p \in X$ and $q \in X$, define

\[ d(p, q) = \begin{cases} 1 & (\text{if } p \ne q) \\ 0 & (\text{if } p = q). \end{cases} \]

Prove that this is a metric. Which subsets of the resulting metric space are open?
Which are closed? Which are compact?

11. For $x \in R^1$ and $y \in R^1$, define

\[
\begin{aligned}
d_1(x, y) &= (x - y)^2, \\
d_2(x, y) &= \sqrt{|x - y|}, \\
d_3(x, y) &= |x^2 - y^2|, \\
d_4(x, y) &= |x - 2y|, \\
d_5(x, y) &= \frac{|x - y|}{1 + |x - y|}.
\end{aligned}
\]

Determine, for each of these, whether it is a metric or not.

12. Let $K \subset R^1$ consist of $0$ and the numbers $1/n$, for $n = 1, 2, 3, \dots$. Prove that $K$ is
compact directly from the definition (without using the Heine-Borel theorem).

13. Construct a compact set of real numbers whose limit points form a countable set.

14. Give an example of an open cover of the segment $(0, 1)$ which has no finite
subcover.

15. Show that Theorem 2.36 and its Corollary become false (in $R^1$, for example) if the
word “compact” is replaced by “closed” or by “bounded.”

16. Regard $Q$, the set of all rational numbers, as a metric space, with $d(p, q) = |p - q|$.
Let $E$ be the set of all $p \in Q$ such that $2 < p^2 < 3$. Show that $E$ is closed and
bounded in $Q$, but that $E$ is not compact. Is $E$ open in $Q$?

17. Let $E$ be the set of all $x \in [0, 1]$ whose decimal expansion contains only the digits
4 and 7. Is $E$ countable? Is $E$ dense in $[0, 1]$? Is $E$ compact? Is $E$ perfect?

18. Is there a nonempty perfect set in $R^1$ which contains no rational number?

19. (a) If $A$ and $B$ are disjoint closed sets in some metric space $X$, prove that they
are separated.

(b) Prove the same for disjoint open sets.

(c) Fix $p \in X$, $\delta > 0$, define $A$ to be the set of all $q \in X$ for which $d(p, q) < \delta$, define
$B$ similarly, with $>$ in place of $<$. Prove that $A$ and $B$ are separated.

(d) Prove that every connected metric space with at least two points is uncountable.
Hint: Use (c).

20. Are closures and interiors of connected sets always connected? (Look at subsets
of $R^2$.)

21. Let $A$ and $B$ be separated subsets of some $R^k$, suppose $a \in A$, $b \in B$, and define

\[ p(t) = (1 - t)a + tb \]

for $t \in R^1$. Put $A_0 = p^{-1}(A)$, $B_0 = p^{-1}(B)$. [Thus $t \in A_0$ if and only if $p(t) \in A$.]

(a) Prove that $A_0$ and $B_0$ are separated subsets of $R^1$.

(b) Prove that there exists $t_0 \in (0, 1)$ such that $p(t_0) \notin A \cup B$.

(c) Prove that every convex subset of $R^k$ is connected.

22. A metric space is called separable if it contains a countable dense subset. Show
that $R^k$ is separable. Hint: Consider the set of points which have only rational
coordinates.

23. A collection $\{V_\alpha\}$ of open subsets of $X$ is said to be a base for $X$ if the following
is true: For every $x \in X$ and every open set $G \subset X$ such that $x \in G$, we have
$x \in V_\alpha \subset G$ for some $\alpha$. In other words, every open set in $X$ is the union of a
subcollection of $\{V_\alpha\}$.

Prove that every separable metric space has a countable base. Hint: Take
all neighborhoods with rational radius and center in some countable dense subset
of $X$.

24. Let $X$ be a metric space in which every infinite subset has a limit point. Prove that
$X$ is separable. Hint: Fix $\delta > 0$, and pick $x_1 \in X$. Having chosen $x_1, \dots, x_j \in X$,
choose $x_{j+1} \in X$, if possible, so that $d(x_i, x_{j+1}) \ge \delta$ for $i = 1, \dots, j$. Show that
this process must stop after a finite number of steps, and that $X$ can therefore be
covered by finitely many neighborhoods of radius $\delta$. Take $\delta = 1/n$ $(n = 1, 2, 3, \dots)$,
and consider the centers of the corresponding neighborhoods.

25. Prove that every compact metric space $K$ has a countable base, and that $K$ is
therefore separable. Hint: For every positive integer $n$, there are finitely many
neighborhoods of radius $1/n$ whose union covers $K$.

26. Let $X$ be a metric space in which every infinite subset has a limit point. Prove
that $X$ is compact. Hint: By Exercises 23 and 24, $X$ has a countable base. It
follows that every open cover of $X$ has a countable subcover $\{G_n\}$, $n = 1, 2, 3, \dots$.
If no finite subcollection of $\{G_n\}$ covers $X$, then the complement $F_n$ of $G_1 \cup \cdots \cup G_n$
is nonempty for each $n$, but $\bigcap F_n$ is empty. If $E$ is a set which contains a point
from each $F_n$, consider a limit point of $E$, and obtain a contradiction.

27. Define a point $p$ in a metric space $X$ to be a condensation point of a set $E \subset X$ if
every neighborhood of $p$ contains uncountably many points of $E$.

Suppose $E \subset R^k$, $E$ is uncountable, and let $P$ be the set of all condensation
points of $E$. Prove that $P$ is perfect and that at most countably many points of $E$
are not in $P$. In other words, show that $P^c \cap E$ is at most countable. Hint: Let
$\{V_n\}$ be a countable base of $R^k$, let $W$ be the union of those $V_n$ for which $E \cap V_n$
is at most countable, and show that $P = W^c$.

28. Prove that every closed set in a separable metric space is the union of a (possibly
empty) perfect set and a set which is at most countable. (Corollary: Every countable
closed set in $R^k$ has isolated points.) Hint: Use Exercise 27.

29. Prove that every open set in $R^1$ is the union of an at most countable collection of
disjoint segments. Hint: Use Exercise 22.

30. Imitate the proof of Theorem 2.43 to obtain the following result:

If $R^k = \bigcup_1^\infty F_n$, where each $F_n$ is a closed subset of $R^k$, then at least one $F_n$
has a nonempty interior.

Equivalent statement: If $G_n$ is a dense open subset of $R^k$, for $n = 1, 2, 3, \dots$,
then $\bigcap_1^\infty G_n$ is not empty (in fact, it is dense in $R^k$).

(This is a special case of Baire's theorem; see Exercise 22, Chap. 3, for the general
case.)



3 

NUMERICAL SEQUENCES AND SERIES 


As the title indicates, this chapter will deal primarily with sequences and series 
of complex numbers. The basic facts about convergence, however, are just as 
easily explained in a more general setting. The first three sections will therefore 
be concerned with sequences in euclidean spaces, or even in metric spaces. 


CONVERGENT SEQUENCES 

3.1 Definition A sequence $\{p_n\}$ in a metric space $X$ is said to converge if there
is a point $p \in X$ with the following property: For every $\varepsilon > 0$ there is an integer
$N$ such that $n \ge N$ implies that $d(p_n, p) < \varepsilon$. (Here $d$ denotes the distance in $X$.)

In this case we also say that $\{p_n\}$ converges to $p$, or that $p$ is the limit of
$\{p_n\}$ [see Theorem 3.2(b)], and we write $p_n \to p$, or

\[ \lim_{n \to \infty} p_n = p. \]

If $\{p_n\}$ does not converge, it is said to diverge.

It might be well to point out that our definition of “convergent sequence”
depends not only on $\{p_n\}$ but also on $X$; for instance, the sequence $\{1/n\}$ converges
in $R^1$ (to $0$), but fails to converge in the set of all positive real numbers
[with $d(x, y) = |x - y|$]. In cases of possible ambiguity, we can be more
precise and specify “convergent in $X$” rather than “convergent.”

We recall that the set of all points $p_n$ $(n = 1, 2, 3, \dots)$ is the range of $\{p_n\}$.
The range of a sequence may be a finite set, or it may be infinite. The sequence
$\{p_n\}$ is said to be bounded if its range is bounded.

As examples, consider the following sequences of complex numbers
(that is, $X = R^2$):

(a) If $s_n = 1/n$, then $\lim_{n \to \infty} s_n = 0$; the range is infinite, and the sequence
is bounded.

(b) If $s_n = n^2$, the sequence $\{s_n\}$ is unbounded, is divergent, and has
infinite range.

(c) If $s_n = 1 + [(-1)^n / n]$, the sequence $\{s_n\}$ converges to $1$, is bounded,
and has infinite range.

(d) If $s_n = i^n$, the sequence $\{s_n\}$ is divergent, is bounded, and has finite
range.

(e) If $s_n = 1$ $(n = 1, 2, 3, \dots)$, then $\{s_n\}$ converges to $1$, is bounded, and
has finite range.
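
Definition 3.1 asks for an integer $N$ corresponding to each $\varepsilon$; for example (a) such an $N$ can be written down explicitly. The sketch below is an illustration only. The function names are ours, a rational $\varepsilon$ is assumed so that the arithmetic is exact, and of course only finitely many terms of a range can be listed.

```python
from fractions import Fraction
import math


def threshold_for_reciprocals(eps):
    """Return an integer N such that n >= N implies |1/n - 0| < eps."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/n < eps  <=>  n > 1/eps, and floor(1/eps) + 1 > 1/eps
    return math.floor(1 / eps) + 1


def range_of(s, count):
    """Set of distinct values among the first `count` terms s(1), ..., s(count)."""
    return {s(n) for n in range(1, count + 1)}


def powers_of_i(n):
    """The term i**n of example (d), read off from its period-4 cycle."""
    return (1j, -1 + 0j, -1j, 1 + 0j)[(n - 1) % 4]


print(threshold_for_reciprocals(Fraction(1, 100)))
print(len(range_of(lambda n: Fraction(1, n), 50)))   # example (a)
print(len(range_of(powers_of_i, 50)))                # example (d)
print(len(range_of(lambda n: 1, 50)))                # example (e)
```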

We now summarize some important properties of convergent sequences 
in metric spaces. 


3.2 Theorem Let $\{p_n\}$ be a sequence in a metric space $X$.

(a) $\{p_n\}$ converges to $p \in X$ if and only if every neighborhood of $p$ contains
all but finitely many of the terms of $\{p_n\}$.

(b) If $p \in X$, $p' \in X$, and if $\{p_n\}$ converges to $p$ and to $p'$, then $p' = p$.

(c) If $\{p_n\}$ converges, then $\{p_n\}$ is bounded.

(d) If $E \subset X$ and if $p$ is a limit point of $E$, then there is a sequence $\{p_n\}$ in $E$
such that $p = \lim_{n \to \infty} p_n$.

Proof (a) Suppose $p_n \to p$ and let $V$ be a neighborhood of $p$. For
some $\varepsilon > 0$, the conditions $d(q, p) < \varepsilon$, $q \in X$ imply $q \in V$. Corresponding
to this $\varepsilon$, there exists $N$ such that $n \ge N$ implies $d(p_n, p) < \varepsilon$. Thus
$n \ge N$ implies $p_n \in V$.

Conversely, suppose every neighborhood of $p$ contains all but
finitely many of the $p_n$. Fix $\varepsilon > 0$, and let $V$ be the set of all $q \in X$ such
that $d(p, q) < \varepsilon$. By assumption, there exists $N$ (corresponding to this $V$)
such that $p_n \in V$ if $n \ge N$. Thus $d(p_n, p) < \varepsilon$ if $n \ge N$; hence $p_n \to p$.

(b) Let $\varepsilon > 0$ be given. There exist integers $N$, $N'$ such that

\[
\begin{aligned}
n \ge N \quad &\text{implies} \quad d(p_n, p) < \frac{\varepsilon}{2}, \\
n \ge N' \quad &\text{implies} \quad d(p_n, p') < \frac{\varepsilon}{2}.
\end{aligned}
\]

Hence if $n \ge \max(N, N')$, we have

\[ d(p, p') \le d(p, p_n) + d(p_n, p') < \varepsilon. \]

Since $\varepsilon$ was arbitrary, we conclude that $d(p, p') = 0$.

(c) Suppose $p_n \to p$. There is an integer $N$ such that $n > N$
implies $d(p_n, p) < 1$. Put

\[ r = \max \{1, d(p_1, p), \dots, d(p_N, p)\}. \]

Then $d(p_n, p) \le r$ for $n = 1, 2, 3, \dots$.

(d) For each positive integer $n$, there is a point $p_n \in E$ such that
$d(p_n, p) < 1/n$. Given $\varepsilon > 0$, choose $N$ so that $N\varepsilon > 1$. If $n > N$, it
follows that $d(p_n, p) < \varepsilon$. Hence $p_n \to p$.

This completes the proof.

For sequences in $R^k$ we can study the relation between convergence, on
the one hand, and the algebraic operations on the other. We first consider
sequences of complex numbers.

3.3 Theorem Suppose $\{s_n\}$, $\{t_n\}$ are complex sequences, and $\lim_{n\to\infty} s_n = s$,
$\lim_{n\to\infty} t_n = t$. Then

(a) $\lim_{n\to\infty} (s_n + t_n) = s + t$;

(b) $\lim_{n\to\infty} c s_n = cs$, $\lim_{n\to\infty} (c + s_n) = c + s$, for any number $c$;

(c) $\lim_{n\to\infty} s_n t_n = st$;

(d) $\lim_{n\to\infty} \frac{1}{s_n} = \frac{1}{s}$, provided $s_n \ne 0$ $(n = 1, 2, 3, \dots)$, and $s \ne 0$.

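Proof

(a) Given $\varepsilon > 0$, there exist integers $N_1$, $N_2$ such that

$$n \ge N_1 \text{ implies } |s_n - s| < \frac{\varepsilon}{2},$$

$$n \ge N_2 \text{ implies } |t_n - t| < \frac{\varepsilon}{2}.$$

If $N = \max(N_1, N_2)$, then $n \ge N$ implies

$$|(s_n + t_n) - (s + t)| \le |s_n - s| + |t_n - t| < \varepsilon.$$

This proves (a). The proof of (b) is trivial.

(c) We use the identity

$$s_n t_n - st = (s_n - s)(t_n - t) + s(t_n - t) + t(s_n - s). \tag{1}$$

Given $\varepsilon > 0$, there are integers $N_1$, $N_2$ such that

$$n \ge N_1 \text{ implies } |s_n - s| < \sqrt{\varepsilon},$$

$$n \ge N_2 \text{ implies } |t_n - t| < \sqrt{\varepsilon}.$$

If we take $N = \max(N_1, N_2)$, $n \ge N$ implies

$$|(s_n - s)(t_n - t)| < \varepsilon,$$

so that

$$\lim_{n\to\infty} (s_n - s)(t_n - t) = 0.$$

We now apply (a) and (b) to (1), and conclude that

$$\lim_{n\to\infty} (s_n t_n - st) = 0.$$

(d) Choosing $m$ such that $|s_n - s| < \frac{1}{2}|s|$ if $n \ge m$, we see that

$$|s_n| > \tfrac{1}{2}|s| \qquad (n \ge m).$$

Given $\varepsilon > 0$, there is an integer $N > m$ such that $n \ge N$ implies

$$|s_n - s| < \tfrac{1}{2}|s|^2 \varepsilon.$$

Hence, for $n \ge N$,

$$\left|\frac{1}{s_n} - \frac{1}{s}\right| = \left|\frac{s_n - s}{s_n s}\right| < \frac{2}{|s|^2}\,|s_n - s| < \varepsilon.$$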


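3.4 Theorem

(a) Suppose $\mathbf{x}_n \in R^k$ $(n = 1, 2, 3, \dots)$ and

$$\mathbf{x}_n = (\alpha_{1,n}, \dots, \alpha_{k,n}).$$

Then $\{\mathbf{x}_n\}$ converges to $\mathbf{x} = (\alpha_1, \dots, \alpha_k)$ if and only if

$$\lim_{n\to\infty} \alpha_{j,n} = \alpha_j \qquad (1 \le j \le k). \tag{2}$$

(b) Suppose $\{\mathbf{x}_n\}$, $\{\mathbf{y}_n\}$ are sequences in $R^k$, $\{\beta_n\}$ is a sequence of real numbers,
and $\mathbf{x}_n \to \mathbf{x}$, $\mathbf{y}_n \to \mathbf{y}$, $\beta_n \to \beta$. Then

$$\lim_{n\to\infty} (\mathbf{x}_n + \mathbf{y}_n) = \mathbf{x} + \mathbf{y}, \qquad \lim_{n\to\infty} \mathbf{x}_n \cdot \mathbf{y}_n = \mathbf{x} \cdot \mathbf{y}, \qquad \lim_{n\to\infty} \beta_n \mathbf{x}_n = \beta \mathbf{x}.$$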

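Proof

(a) If $\mathbf{x}_n \to \mathbf{x}$, the inequalities

$$|\alpha_{j,n} - \alpha_j| \le |\mathbf{x}_n - \mathbf{x}|,$$

which follow immediately from the definition of the norm in $R^k$, show that
(2) holds.

Conversely, if (2) holds, then to each $\varepsilon > 0$ there corresponds an
integer $N$ such that $n \ge N$ implies

$$|\alpha_{j,n} - \alpha_j| < \frac{\varepsilon}{\sqrt{k}} \qquad (1 \le j \le k).$$

Hence $n \ge N$ implies

$$|\mathbf{x}_n - \mathbf{x}| = \left\{ \sum_{j=1}^{k} |\alpha_{j,n} - \alpha_j|^2 \right\}^{1/2} < \varepsilon,$$

so that $\mathbf{x}_n \to \mathbf{x}$. This proves (a).

Part (b) follows from (a) and Theorem 3.3.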


SUBSEQUENCES 


3.5 Definition Given a sequence $\{p_n\}$, consider a sequence $\{n_k\}$ of positive
integers, such that $n_1 < n_2 < n_3 < \cdots$. Then the sequence $\{p_{n_i}\}$ is called a
subsequence of $\{p_n\}$. If $\{p_{n_i}\}$ converges, its limit is called a subsequential limit
of $\{p_n\}$.

It is clear that $\{p_n\}$ converges to $p$ if and only if every subsequence of
$\{p_n\}$ converges to $p$. We leave the details of the proof to the reader.


3.6 Theorem

(a) If $\{p_n\}$ is a sequence in a compact metric space $X$, then some
subsequence of $\{p_n\}$ converges to a point of $X$.

(b) Every bounded sequence in $R^k$ contains a convergent subsequence.

Proof

(a) Let $E$ be the range of $\{p_n\}$. If $E$ is finite then there is a $p \in E$ and a
sequence $\{n_i\}$ with $n_1 < n_2 < n_3 < \cdots$, such that

$$p_{n_1} = p_{n_2} = \cdots = p.$$

The subsequence $\{p_{n_i}\}$ so obtained converges evidently to $p$.

If $E$ is infinite, Theorem 2.37 shows that $E$ has a limit point $p \in X$.
Choose $n_1$ so that $d(p, p_{n_1}) < 1$. Having chosen $n_1, \dots, n_{i-1}$, we see from
Theorem 2.20 that there is an integer $n_i > n_{i-1}$ such that $d(p, p_{n_i}) < 1/i$.
Then $\{p_{n_i}\}$ converges to $p$.

(b) This follows from (a), since Theorem 2.41 implies that every bounded
subset of $R^k$ lies in a compact subset of $R^k$.

3.7 Theorem The subsequential limits of a sequence $\{p_n\}$ in a metric space $X$
form a closed subset of $X$.

Proof Let $E^*$ be the set of all subsequential limits of $\{p_n\}$ and let $q$ be a
limit point of $E^*$. We have to show that $q \in E^*$.

Choose $n_1$ so that $p_{n_1} \ne q$. (If no such $n_1$ exists, then $E^*$ has only
one point, and there is nothing to prove.) Put $\delta = d(q, p_{n_1})$. Suppose
$n_1, \dots, n_{i-1}$ are chosen. Since $q$ is a limit point of $E^*$, there is an $x \in E^*$
with $d(x, q) < 2^{-i}\delta$. Since $x \in E^*$, there is an $n_i > n_{i-1}$ such that
$d(x, p_{n_i}) < 2^{-i}\delta$. Thus

$$d(q, p_{n_i}) \le 2^{1-i}\delta$$

for $i = 1, 2, 3, \dots$. This says that $\{p_{n_i}\}$ converges to $q$. Hence $q \in E^*$.


CAUCHY SEQUENCES 

3.8 Definition A sequence $\{p_n\}$ in a metric space $X$ is said to be a Cauchy
sequence if for every $\varepsilon > 0$ there is an integer $N$ such that $d(p_n, p_m) < \varepsilon$ if $n \ge N$
and $m \ge N$.

In our discussion of Cauchy sequences, as well as in other situations 
which will arise later, the following geometric concept will be useful. 

3.9 Definition Let $E$ be a subset of a metric space $X$, and let $S$ be the set of
all real numbers of the form $d(p, q)$, with $p \in E$ and $q \in E$. The sup of $S$ is
called the diameter of $E$.

If $\{p_n\}$ is a sequence in $X$ and if $E_N$ consists of the points $p_N, p_{N+1}, p_{N+2}, \dots$,
it is clear from the two preceding definitions that $\{p_n\}$ is a Cauchy sequence
if and only if

$$\lim_{N\to\infty} \operatorname{diam} E_N = 0.$$

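3.10 Theorem

(a) If $\bar{E}$ is the closure of a set $E$ in a metric space $X$, then

$$\operatorname{diam} \bar{E} = \operatorname{diam} E.$$

(b) If $K_n$ is a sequence of compact sets in $X$ such that $K_n \supset K_{n+1}$
$(n = 1, 2, 3, \dots)$ and if

$$\lim_{n\to\infty} \operatorname{diam} K_n = 0,$$

then $\bigcap_{1}^{\infty} K_n$ consists of exactly one point.

Proof

(a) Since $E \subset \bar{E}$, it is clear that

$$\operatorname{diam} E \le \operatorname{diam} \bar{E}.$$

Fix $\varepsilon > 0$, and choose $p \in \bar{E}$, $q \in \bar{E}$. By the definition of $\bar{E}$, there are
points $p'$, $q'$ in $E$ such that $d(p, p') < \varepsilon$, $d(q, q') < \varepsilon$. Hence

$$d(p, q) \le d(p, p') + d(p', q') + d(q', q) < 2\varepsilon + d(p', q') \le 2\varepsilon + \operatorname{diam} E.$$

It follows that

$$\operatorname{diam} \bar{E} \le 2\varepsilon + \operatorname{diam} E,$$

and since $\varepsilon$ was arbitrary, (a) is proved.

(b) Put $K = \bigcap_{1}^{\infty} K_n$. By Theorem 2.36, $K$ is not empty. If $K$ contains
more than one point, then $\operatorname{diam} K > 0$. But for each $n$, $K_n \supset K$, so that
$\operatorname{diam} K_n \ge \operatorname{diam} K$. This contradicts the assumption that $\operatorname{diam} K_n \to 0$.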


3.11 Theorem

(a) In any metric space $X$, every convergent sequence is a Cauchy sequence.

(b) If $X$ is a compact metric space and if $\{p_n\}$ is a Cauchy sequence in $X$,
then $\{p_n\}$ converges to some point of $X$.

(c) In $R^k$, every Cauchy sequence converges.

Note: The difference between the definition of convergence and
the definition of a Cauchy sequence is that the limit is explicitly involved
in the former, but not in the latter. Thus Theorem 3.11(b) may enable us
to decide whether or not a given sequence converges without knowledge
of the limit to which it may converge.

The fact (contained in Theorem 3.11) that a sequence converges in
$R^k$ if and only if it is a Cauchy sequence is usually called the Cauchy
criterion for convergence.

Proof

(a) If $p_n \to p$ and if $\varepsilon > 0$, there is an integer $N$ such that $d(p, p_n) < \varepsilon$
for all $n \ge N$. Hence

$$d(p_n, p_m) \le d(p_n, p) + d(p, p_m) < 2\varepsilon$$

as soon as $n \ge N$ and $m \ge N$. Thus $\{p_n\}$ is a Cauchy sequence.

(b) Let $\{p_n\}$ be a Cauchy sequence in the compact space $X$. For
$N = 1, 2, 3, \dots$, let $E_N$ be the set consisting of $p_N, p_{N+1}, p_{N+2}, \dots$.
Then

$$\lim_{N\to\infty} \operatorname{diam} \bar{E}_N = 0, \tag{3}$$

by Definition 3.9 and Theorem 3.10(a). Being a closed subset of the
compact space $X$, each $\bar{E}_N$ is compact (Theorem 2.35). Also $E_N \supset E_{N+1}$,
so that $\bar{E}_N \supset \bar{E}_{N+1}$.

Theorem 3.10(b) shows now that there is a unique $p \in X$ which lies
in every $\bar{E}_N$.

Let $\varepsilon > 0$ be given. By (3) there is an integer $N_0$ such that
$\operatorname{diam} \bar{E}_N < \varepsilon$ if $N \ge N_0$. Since $p \in \bar{E}_N$, it follows that $d(p, q) < \varepsilon$ for
every $q \in \bar{E}_N$, hence for every $q \in E_N$. In other words, $d(p, p_n) < \varepsilon$ if
$n \ge N_0$. This says precisely that $p_n \to p$.

(c) Let $\{\mathbf{x}_n\}$ be a Cauchy sequence in $R^k$. Define $E_N$ as in (b), with $\mathbf{x}_i$
in place of $p_i$. For some $N$, $\operatorname{diam} E_N < 1$. The range of $\{\mathbf{x}_n\}$ is the union
of $E_N$ and the finite set $\{\mathbf{x}_1, \dots, \mathbf{x}_{N-1}\}$. Hence $\{\mathbf{x}_n\}$ is bounded. Since
every bounded subset of $R^k$ has compact closure in $R^k$ (Theorem 2.41),
(c) follows from (b).

3.12 Definition A metric space in which every Cauchy sequence converges is 
said to be complete. 

Thus Theorem 3.11 says that all compact metric spaces and all Euclidean
spaces are complete. Theorem 3.11 implies also that every closed subset $E$ of a
complete metric space $X$ is complete. (Every Cauchy sequence in $E$ is a Cauchy
sequence in $X$, hence it converges to some $p \in X$, and actually $p \in E$ since $E$ is
closed.) An example of a metric space which is not complete is the space of all
rational numbers, with $d(x, y) = |x - y|$.

Theorem 3.2(c) and example (d) of Definition 3.1 show that convergent
sequences are bounded, but that bounded sequences in $R^k$ need not converge.
However, there is one important case in which convergence is equivalent to
boundedness; this happens for monotonic sequences in $R^1$.

3.13 Definition A sequence $\{s_n\}$ of real numbers is said to be

(a) monotonically increasing if $s_n \le s_{n+1}$ $(n = 1, 2, 3, \dots)$;

(b) monotonically decreasing if $s_n \ge s_{n+1}$ $(n = 1, 2, 3, \dots)$.

The class of monotonic sequences consists of the increasing and the
decreasing sequences.

3.14 Theorem Suppose $\{s_n\}$ is monotonic. Then $\{s_n\}$ converges if and only if it
is bounded.


Proof Suppose $s_n \le s_{n+1}$ (the proof is analogous in the other case).
Let $E$ be the range of $\{s_n\}$. If $\{s_n\}$ is bounded, let $s$ be the least upper
bound of $E$. Then

$$s_n \le s \qquad (n = 1, 2, 3, \dots).$$

For every $\varepsilon > 0$, there is an integer $N$ such that

$$s - \varepsilon < s_N \le s,$$

for otherwise $s - \varepsilon$ would be an upper bound of $E$. Since $\{s_n\}$ increases,
$n \ge N$ therefore implies

$$s - \varepsilon < s_n \le s,$$

which shows that $\{s_n\}$ converges (to $s$).

The converse follows from Theorem 3.2(c). 


UPPER AND LOWER LIMITS 

3.15 Definition Let $\{s_n\}$ be a sequence of real numbers with the following
property: For every real $M$ there is an integer $N$ such that $n \ge N$ implies
$s_n \ge M$. We then write

$$s_n \to +\infty.$$

Similarly, if for every real $M$ there is an integer $N$ such that $n \ge N$ implies
$s_n \le M$, we write

$$s_n \to -\infty.$$

It should be noted that we now use the symbol $\to$ (introduced in Definition 3.1)
for certain types of divergent sequences, as well as for convergent
sequences, but that the definitions of convergence and of limit, given in
Definition 3.1, are in no way changed.

3.16 Definition Let $\{s_n\}$ be a sequence of real numbers. Let $E$ be the set of
numbers $x$ (in the extended real number system) such that $s_{n_k} \to x$ for some
subsequence $\{s_{n_k}\}$. This set $E$ contains all subsequential limits as defined in
Definition 3.5, plus possibly the numbers $+\infty$, $-\infty$.

We now recall Definitions 1.8 and 1.23 and put

$$s^* = \sup E, \qquad s_* = \inf E.$$

The numbers $s^*$, $s_*$ are called the upper and lower limits of $\{s_n\}$; we use the
notation

$$\limsup_{n\to\infty} s_n = s^*, \qquad \liminf_{n\to\infty} s_n = s_*.$$

3.17 Theorem Let $\{s_n\}$ be a sequence of real numbers. Let $E$ and $s^*$ have the
same meaning as in Definition 3.16. Then $s^*$ has the following two properties:

(a) $s^* \in E$.

(b) If $x > s^*$, there is an integer $N$ such that $n \ge N$ implies $s_n < x$.

Moreover, $s^*$ is the only number with the properties (a) and (b).

Of course, an analogous result is true for $s_*$.

Proof

(a) If $s^* = +\infty$, then $E$ is not bounded above; hence $\{s_n\}$ is not bounded
above, and there is a subsequence $\{s_{n_k}\}$ such that $s_{n_k} \to +\infty$.

If $s^*$ is real, then $E$ is bounded above, and at least one subsequential
limit exists, so that (a) follows from Theorems 3.7 and 2.28.

If $s^* = -\infty$, then $E$ contains only one element, namely $-\infty$, and
there is no subsequential limit. Hence, for any real $M$, $s_n > M$ for at
most a finite number of values of $n$, so that $s_n \to -\infty$.

This establishes (a) in all cases.

(b) Suppose there is a number $x > s^*$ such that $s_n \ge x$ for infinitely
many values of $n$. In that case, there is a number $y \in E$ such that
$y \ge x > s^*$, contradicting the definition of $s^*$.

Thus $s^*$ satisfies (a) and (b).

To show the uniqueness, suppose there are two numbers, $p$ and $q$,
which satisfy (a) and (b), and suppose $p < q$. Choose $x$ such that $p < x < q$.
Since $p$ satisfies (b), we have $s_n < x$ for $n \ge N$. But then $q$ cannot satisfy (a).

3.18 Examples

(a) Let $\{s_n\}$ be a sequence containing all rationals. Then every real
number is a subsequential limit, and

$$\limsup_{n\to\infty} s_n = +\infty, \qquad \liminf_{n\to\infty} s_n = -\infty.$$

(b) Let $s_n = (-1)^n / [1 + (1/n)]$. Then

$$\limsup_{n\to\infty} s_n = 1, \qquad \liminf_{n\to\infty} s_n = -1.$$

(c) For a real-valued sequence $\{s_n\}$, $\lim_{n\to\infty} s_n = s$ if and only if

$$\limsup_{n\to\infty} s_n = \liminf_{n\to\infty} s_n = s.$$
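Example (b) can be explored numerically. The following is a minimal sketch, not part of the text: it assumes exact rational arithmetic via Python's `fractions` module, and the helper name `s` is chosen here only for illustration. It shows the even-indexed terms approaching 1 and the odd-indexed terms approaching -1, which are the two subsequential limits.

```python
from fractions import Fraction


def s(n):
    """s_n = (-1)^n / (1 + 1/n), simplified to (-1)^n * n / (n + 1)."""
    return Fraction((-1) ** n * n, n + 1)


# Even-indexed terms increase toward 1; odd-indexed terms decrease toward -1.
for m in (1, 10, 100, 1000):
    even_term, odd_term = s(2 * m), s(2 * m - 1)
    print(m, float(even_term), float(odd_term))
```

The distance of $s_{2m}$ from 1 is exactly $1/(2m+1)$, and the distance of $s_{2m-1}$ from $-1$ is exactly $1/(2m)$, so no other subsequential limit can occur.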

We close this section with a theorem which is useful, and whose proof is 
quite trivial: 

3.19 Theorem If $s_n \le t_n$ for $n \ge N$, where $N$ is fixed, then

$$\liminf_{n\to\infty} s_n \le \liminf_{n\to\infty} t_n,$$

$$\limsup_{n\to\infty} s_n \le \limsup_{n\to\infty} t_n.$$


SOME SPECIAL SEQUENCES 

We shall now compute the limits of some sequences which occur frequently.
The proofs will all be based on the following remark: If $0 \le x_n \le s_n$ for $n \ge N$,
where $N$ is some fixed number, and if $s_n \to 0$, then $x_n \to 0$.




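3.20 Theorem

(a) If $p > 0$, then $\lim_{n\to\infty} \frac{1}{n^p} = 0$.

(b) If $p > 0$, then $\lim_{n\to\infty} \sqrt[n]{p} = 1$.

(c) $\lim_{n\to\infty} \sqrt[n]{n} = 1$.

(d) If $p > 0$ and $\alpha$ is real, then $\lim_{n\to\infty} \frac{n^\alpha}{(1+p)^n} = 0$.

(e) If $|x| < 1$, then $\lim_{n\to\infty} x^n = 0$.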


Proof

(a) Take $n > (1/\varepsilon)^{1/p}$. (Note that the archimedean property of the real
number system is used here.)

(b) If $p > 1$, put $x_n = \sqrt[n]{p} - 1$. Then $x_n > 0$, and, by the binomial
theorem,

$$1 + n x_n \le (1 + x_n)^n = p,$$

so that

$$0 < x_n \le \frac{p - 1}{n}.$$

Hence $x_n \to 0$. If $p = 1$, (b) is trivial, and if $0 < p < 1$, the result is obtained
by taking reciprocals.

(c) Put $x_n = \sqrt[n]{n} - 1$. Then $x_n \ge 0$, and, by the binomial theorem,

$$n = (1 + x_n)^n \ge \frac{n(n-1)}{2}\, x_n^2.$$

Hence

$$0 \le x_n \le \sqrt{\frac{2}{n-1}} \qquad (n \ge 2).$$

(d) Let $k$ be an integer such that $k > \alpha$, $k > 0$. For $n > 2k$,

$$(1 + p)^n > \binom{n}{k} p^k = \frac{n(n-1)\cdots(n-k+1)}{k!}\, p^k > \frac{n^k p^k}{2^k k!}.$$

Hence

$$0 < \frac{n^\alpha}{(1+p)^n} < \frac{2^k k!}{p^k}\, n^{\alpha - k} \qquad (n > 2k).$$

Since $\alpha - k < 0$, $n^{\alpha - k} \to 0$, by (a).

(e) Take $\alpha = 0$ in (d).
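The explicit bounds obtained in parts (b) and (c) of the proof can be checked numerically. This is only an illustrative sketch in floating-point arithmetic; the function names and the sample value p = 5 are assumptions made here, not part of the text.

```python
import math


def nth_root_gap(p, n):
    """x_n = p**(1/n) - 1, the quantity bounded in part (b)."""
    return p ** (1.0 / n) - 1.0


def root_of_n_gap(n):
    """x_n = n**(1/n) - 1, the quantity bounded in part (c)."""
    return n ** (1.0 / n) - 1.0


def bound_c(n):
    """Upper bound sqrt(2 / (n - 1)) from part (c), valid for n >= 2."""
    return math.sqrt(2.0 / (n - 1))


p = 5.0  # sample value with p > 1
for n in (2, 10, 100, 10_000):
    print(n, nth_root_gap(p, n), (p - 1) / n, root_of_n_gap(n), bound_c(n))
```

Both gaps shrink to 0, and each stays below its bound; the bound in (c) decays only like $n^{-1/2}$, which is crude but sufficient for the limit.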


SERIES 

In the remainder of this chapter, all sequences and series under consideration
will be complex-valued, unless the contrary is explicitly stated. Extensions of
some of the theorems which follow, to series with terms in $R^k$, are mentioned
in Exercise 15.

3.21 Definition Given a sequence $\{a_n\}$, we use the notation

$$\sum_{n=p}^{q} a_n \qquad (p \le q)$$

to denote the sum $a_p + a_{p+1} + \cdots + a_q$. With $\{a_n\}$ we associate a sequence
$\{s_n\}$, where

$$s_n = \sum_{k=1}^{n} a_k.$$

For $\{s_n\}$ we also use the symbolic expression

$$a_1 + a_2 + a_3 + \cdots$$

or, more concisely,

$$\sum_{n=1}^{\infty} a_n. \tag{4}$$

The symbol (4) we call an infinite series, or just a series. The numbers
$s_n$ are called the partial sums of the series. If $\{s_n\}$ converges to $s$, we say that
the series converges, and write

$$\sum_{n=1}^{\infty} a_n = s.$$

The number $s$ is called the sum of the series; but it should be clearly
understood that $s$ is the limit of a sequence of sums, and is not obtained simply by
addition.

If $\{s_n\}$ diverges, the series is said to diverge.

Sometimes, for convenience of notation, we shall consider series of the form

$$\sum_{n=0}^{\infty} a_n. \tag{5}$$

And frequently, when there is no possible ambiguity, or when the distinction
is immaterial, we shall simply write $\sum a_n$ in place of (4) or (5).

It is clear that every theorem about sequences can be stated in terms of
series (putting $a_1 = s_1$, and $a_n = s_n - s_{n-1}$ for $n > 1$), and vice versa. But it is
nevertheless useful to consider both concepts.

The Cauchy criterion (Theorem 3.11) can be restated in the following form:

3.22 Theorem $\sum a_n$ converges if and only if for every $\varepsilon > 0$ there is an integer
$N$ such that

$$\left| \sum_{k=n}^{m} a_k \right| \le \varepsilon \tag{6}$$

if $m \ge n \ge N$.

In particular, by taking $m = n$, (6) becomes

$$|a_n| \le \varepsilon \qquad (n \ge N).$$

In other words:

3.23 Theorem If $\sum a_n$ converges, then $\lim_{n\to\infty} a_n = 0$.

The condition $a_n \to 0$ is not, however, sufficient to ensure convergence
of $\sum a_n$. For instance, the series

$$\sum_{n=1}^{\infty} \frac{1}{n}$$

diverges; for the proof we refer to Theorem 3.28.

Theorem 3.14, concerning monotonic sequences, also has an immediate 
counterpart for series. 

3.24 Theorem A series of nonnegative¹ terms converges if and only if its
partial sums form a bounded sequence.

We now turn to a convergence test of a different nature, the so-called
“comparison test.”

3.25 Theorem

(a) If $|a_n| \le c_n$ for $n \ge N_0$, where $N_0$ is some fixed integer, and if $\sum c_n$
converges, then $\sum a_n$ converges.

(b) If $a_n \ge d_n \ge 0$ for $n \ge N_0$, and if $\sum d_n$ diverges, then $\sum a_n$ diverges.

Note that (b) applies only to series of nonnegative terms $a_n$.

Proof Given $\varepsilon > 0$, there exists $N \ge N_0$ such that $m \ge n \ge N$ implies

$$\sum_{k=n}^{m} c_k \le \varepsilon,$$

by the Cauchy criterion. Hence

$$\left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k| \le \sum_{k=n}^{m} c_k \le \varepsilon,$$

and (a) follows.

Next, (b) follows from (a), for if $\sum a_n$ converges, so must $\sum d_n$ [note
that (b) also follows from Theorem 3.24].


¹ The expression “nonnegative” always refers to real numbers.

The comparison test is a very useful one; to use it efficiently, we have to
become familiar with a number of series of nonnegative terms whose convergence
or divergence is known.


SERIES OF NONNEGATIVE TERMS 

The simplest of all is perhaps the geometric series. 

3.26 Theorem If $0 \le x < 1$, then

$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}.$$

If $x \ge 1$, the series diverges.

Proof If $x \ne 1$,

$$s_n = \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}.$$

The result follows if we let $n \to \infty$. For $x = 1$, we get

$$1 + 1 + 1 + \cdots,$$

which evidently diverges.
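The closed form for the partial sums is easy to confirm on a concrete case. The sketch below is an illustration only; it assumes exact rationals from Python's `fractions` module, the sample ratio x = 1/3, and helper names invented here.

```python
from fractions import Fraction


def partial_sum(x, n):
    """s_n = 1 + x + ... + x**n, computed term by term."""
    return sum(x ** k for k in range(n + 1))


def closed_form(x, n):
    """(1 - x**(n+1)) / (1 - x), the formula from the proof (needs x != 1)."""
    return (1 - x ** (n + 1)) / (1 - x)


x = Fraction(1, 3)          # sample ratio with 0 <= x < 1
limit = 1 / (1 - x)         # the claimed sum, here 3/2
for n in (0, 1, 5, 20):
    print(n, partial_sum(x, n), float(limit - partial_sum(x, n)))
```

The gap between the limit and $s_n$ equals $x^{n+1}/(1-x)$ exactly, which makes the geometric rate of convergence visible.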


In many cases which occur in applications, the terms of the series decrease
monotonically. The following theorem of Cauchy is therefore of particular
interest. The striking feature of the theorem is that a rather “thin” subsequence
of $\{a_n\}$ determines the convergence or divergence of $\sum a_n$.


3.27 Theorem Suppose $a_1 \ge a_2 \ge a_3 \ge \cdots \ge 0$. Then the series $\sum_{n=1}^{\infty} a_n$
converges if and only if the series

$$\sum_{k=0}^{\infty} 2^k a_{2^k} = a_1 + 2a_2 + 4a_4 + 8a_8 + \cdots \tag{7}$$

converges.

Proof By Theorem 3.24, it suffices to consider boundedness of the
partial sums. Let

$$s_n = a_1 + a_2 + \cdots + a_n,$$

$$t_k = a_1 + 2a_2 + \cdots + 2^k a_{2^k}.$$

For $n < 2^k$,

$$\begin{aligned}
s_n &\le a_1 + (a_2 + a_3) + \cdots + (a_{2^k} + \cdots + a_{2^{k+1}-1}) \\
&\le a_1 + 2a_2 + \cdots + 2^k a_{2^k} \\
&= t_k,
\end{aligned}$$

so that

$$s_n \le t_k. \tag{8}$$

On the other hand, if $n > 2^k$,

$$\begin{aligned}
s_n &\ge a_1 + a_2 + (a_3 + a_4) + \cdots + (a_{2^{k-1}+1} + \cdots + a_{2^k}) \\
&\ge \tfrac{1}{2}a_1 + a_2 + 2a_4 + \cdots + 2^{k-1} a_{2^k} \\
&= \tfrac{1}{2} t_k,
\end{aligned}$$

so that

$$2 s_n \ge t_k. \tag{9}$$

By (8) and (9), the sequences $\{s_n\}$ and $\{t_k\}$ are either both bounded
or both unbounded. This completes the proof.
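Inequalities (8) and (9) can be spot-checked for a specific decreasing sequence. The sketch below is an illustration under assumptions made here: the sample terms $a_n = 1/n^2$, exact rationals, and the helper names `a`, `s`, `t`.

```python
from fractions import Fraction


def a(n):
    """Sample nonincreasing nonnegative terms a_n = 1 / n**2 (an assumed example)."""
    return Fraction(1, n * n)


def s(n):
    """Partial sum s_n = a_1 + ... + a_n."""
    return sum(a(j) for j in range(1, n + 1))


def t(k):
    """Condensed partial sum t_k = a_1 + 2 a_2 + ... + 2**k a_{2**k}."""
    return sum(2 ** j * a(2 ** j) for j in range(k + 1))


for k in range(1, 7):
    below, above = 2 ** k - 1, 2 ** k + 1
    # (8): s_n <= t_k for n < 2**k;  (9): 2 s_n >= t_k for n > 2**k.
    print(k, float(s(below)), float(t(k)), float(2 * s(above)))
```

For this choice $t_k = 2 - 2^{-k}$, so the condensed sums are bounded by 2, and (8) then bounds every $s_n$ by 2 as well.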


3.28 Theorem $\sum \frac{1}{n^p}$ converges if $p > 1$ and diverges if $p \le 1$.

Proof If $p \le 0$, divergence follows from Theorem 3.23. If $p > 0$,
Theorem 3.27 is applicable, and we are led to the series

$$\sum_{k=0}^{\infty} 2^k \cdot \frac{1}{2^{kp}} = \sum_{k=0}^{\infty} 2^{(1-p)k}.$$

Now, $2^{1-p} < 1$ if and only if $1 - p < 0$, and the result follows by
comparison with the geometric series (take $x = 2^{1-p}$ in Theorem 3.26).

As a further application of Theorem 3.27, we prove: 


3.29 Theorem If $p > 1$,

$$\sum_{n=2}^{\infty} \frac{1}{n (\log n)^p} \tag{10}$$

converges; if $p \le 1$, the series diverges.

Remark “$\log n$” denotes the logarithm of $n$ to the base $e$ (compare Exercise 7,
Chap. 1); the number $e$ will be defined in a moment (see Definition 3.30). We
let the series start with $n = 2$, since $\log 1 = 0$.

Proof The monotonicity of the logarithmic function (which will be
discussed in more detail in Chap. 8) implies that $\{\log n\}$ increases. Hence
$\{1/(n \log n)\}$ decreases, and we can apply Theorem 3.27 to (10); this
leads us to the series

$$\sum_{k=1}^{\infty} 2^k \cdot \frac{1}{2^k (\log 2^k)^p} = \sum_{k=1}^{\infty} \frac{1}{(k \log 2)^p} = \frac{1}{(\log 2)^p} \sum_{k=1}^{\infty} \frac{1}{k^p}, \tag{11}$$

and Theorem 3.29 follows from Theorem 3.28.


This procedure may evidently be continued. For instance,

$$\sum_{n=3}^{\infty} \frac{1}{n \log n \, \log \log n} \tag{12}$$

diverges, whereas

$$\sum_{n=3}^{\infty} \frac{1}{n \log n \, (\log \log n)^2} \tag{13}$$

converges.

We may now observe that the terms of the series (12) differ very little 
from those of (13). Still, one diverges, the other converges. If we continue the 
process which led us from Theorem 3.28 to Theorem 3.29, and then to (12) and 
(13), we get pairs of convergent and divergent series whose terms differ even 
less than those of (12) and (13). One might thus be led to the conjecture that 
there is a limiting situation of some sort, a “boundary” with all convergent 
series on one side, all divergent series on the other side — at least as far as series 
with monotonic coefficients are concerned. This notion of “boundary” is of 
course quite vague. The point we wish to make is this: No matter how we make
this notion precise, the conjecture is false. Exercises 11(b) and 12(b) may serve
as illustrations.

We do not wish to go any deeper into this aspect of convergence theory, 
and refer the reader to Knopp’s “Theory and Application of Infinite Series,” 
Chap. IX, particularly Sec. 41. 


THE NUMBER e


3.30 Definition $e = \sum_{n=0}^{\infty} \frac{1}{n!}$.

Here $n! = 1 \cdot 2 \cdot 3 \cdots n$ if $n \ge 1$, and $0! = 1$.

Since

$$\begin{aligned}
s_n &= 1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 2 \cdot 3} + \cdots + \frac{1}{1 \cdot 2 \cdots n} \\
&< 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} < 3,
\end{aligned}$$

the series converges, and the definition makes sense. In fact, the series converges
very rapidly and allows us to compute $e$ with great accuracy.

It is of interest to note that e can also be defined by means of another 
limit process; the proof provides a good illustration of operations with limits: 


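3.31 Theorem $\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e$.

Proof Let

$$s_n = \sum_{k=0}^{n} \frac{1}{k!}, \qquad t_n = \left(1 + \frac{1}{n}\right)^n.$$

By the binomial theorem,

$$t_n = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right).$$

Hence $t_n \le s_n$, so that

$$\limsup_{n\to\infty} t_n \le e, \tag{14}$$

by Theorem 3.19. Next, if $n \ge m$,

$$t_n \ge 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots + \frac{1}{m!}\left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{m-1}{n}\right).$$

Let $n \to \infty$, keeping $m$ fixed. We get

$$\liminf_{n\to\infty} t_n \ge 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{m!},$$

so that

$$s_m \le \liminf_{n\to\infty} t_n.$$

Letting $m \to \infty$, we finally get

$$e \le \liminf_{n\to\infty} t_n. \tag{15}$$

The theorem follows from (14) and (15).

The rapidity with which the series $\sum \frac{1}{n!}$ converges can be estimated as
follows: If $s_n$ has the same meaning as above, we have

$$\begin{aligned}
e - s_n &= \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \frac{1}{(n+3)!} + \cdots \\
&< \frac{1}{(n+1)!} \left\{ 1 + \frac{1}{n+1} + \frac{1}{(n+1)^2} + \cdots \right\} = \frac{1}{n!\, n},
\end{aligned}$$

so that

$$0 < e - s_n < \frac{1}{n!\, n}. \tag{16}$$

Thus $s_{10}$, for instance, approximates $e$ with an error less than $10^{-7}$. The
inequality (16) is of theoretical interest as well, since it enables us to prove the
irrationality of $e$ very easily.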
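Both the inequality $t_n \le s_n$ from the proof of Theorem 3.31 and the estimate (16) lend themselves to a quick numerical check. The following sketch is illustrative only: it assumes exact rationals for $s_n$ and $t_n$, uses the floating-point constant `math.e` as a stand-in for $e$, and the helper names are chosen here.

```python
import math
from fractions import Fraction


def s(n):
    """Partial sum s_n = 1/0! + 1/1! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, math.factorial(k)) for k in range(n + 1))


def t(n):
    """t_n = (1 + 1/n)**n as an exact fraction."""
    return (1 + Fraction(1, n)) ** n


def bound(n):
    """Right side of (16): 1 / (n! * n)."""
    return Fraction(1, math.factorial(n) * n)


for n in (1, 2, 5, 10):
    err = math.e - float(s(n))  # approximate error e - s_n
    print(n, float(t(n)), float(s(n)), err, float(bound(n)))
```

The output makes the contrast plain: $s_{10}$ is already within about $3 \times 10^{-8}$ of $e$, while $t_n$ approaches the same limit far more slowly.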


3.32 Theorem $e$ is irrational.

Proof Suppose $e$ is rational. Then $e = p/q$, where $p$ and $q$ are positive
integers. By (16),

$$0 < q!\,(e - s_q) < \frac{1}{q}. \tag{17}$$

By our assumption, $q!\,e$ is an integer. Since

$$q!\, s_q = q! \left( 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{q!} \right)$$

is an integer, we see that $q!\,(e - s_q)$ is an integer.

Since $q \ge 1$, (17) implies the existence of an integer between 0 and 1.
We have thus reached a contradiction.

Actually, e is not even an algebraic number. For a simple proof of this, 
see page 25 of Niven’s book, or page 176 of Herstein’s, cited in the Bibliography. 


THE ROOT AND RATIO TESTS 

3.33 Theorem (Root Test) Given $\sum a_n$, put $\alpha = \limsup_{n\to\infty} \sqrt[n]{|a_n|}$.

Then

(a) if $\alpha < 1$, $\sum a_n$ converges;

(b) if $\alpha > 1$, $\sum a_n$ diverges;

(c) if $\alpha = 1$, the test gives no information.

Proof If $\alpha < 1$, we can choose $\beta$ so that $\alpha < \beta < 1$, and an integer $N$
such that

$$\sqrt[n]{|a_n|} < \beta$$

for $n \ge N$ [by Theorem 3.17(b)]. That is, $n \ge N$ implies

$$|a_n| < \beta^n.$$

Since $0 < \beta < 1$, $\sum \beta^n$ converges. Convergence of $\sum a_n$ follows now from
the comparison test.

If $\alpha > 1$, then, again by Theorem 3.17, there is a sequence $\{n_k\}$ such
that

$$\sqrt[n_k]{|a_{n_k}|} \to \alpha.$$

Hence $|a_n| > 1$ for infinitely many values of $n$, so that the condition
$a_n \to 0$, necessary for convergence of $\sum a_n$, does not hold (Theorem 3.23).

To prove (c), we consider the series

$$\sum \frac{1}{n}, \qquad \sum \frac{1}{n^2}.$$

For each of these series $\alpha = 1$, but the first diverges, the second converges.


3.34 Theorem (Ratio Test) The series $\sum a_n$

(a) converges if $\limsup_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| < 1$,

(b) diverges if $\left| \frac{a_{n+1}}{a_n} \right| \ge 1$ for $n \ge n_0$, where $n_0$ is some fixed integer.

Proof If condition (a) holds, we can find $\beta < 1$, and an integer $N$, such
that

$$\left| \frac{a_{n+1}}{a_n} \right| < \beta$$

for $n \ge N$. In particular,

$$\begin{aligned}
|a_{N+1}| &< \beta |a_N|, \\
|a_{N+2}| &< \beta |a_{N+1}| < \beta^2 |a_N|, \\
&\;\;\vdots \\
|a_{N+p}| &< \beta^p |a_N|.
\end{aligned}$$

That is,

$$|a_n| < |a_N| \beta^{-N} \cdot \beta^n$$

for $n \ge N$, and (a) follows from the comparison test, since $\sum \beta^n$ converges.

If $|a_{n+1}| \ge |a_n|$ for $n \ge n_0$, it is easily seen that the condition $a_n \to 0$
does not hold, and (b) follows.

Note: The knowledge that $\lim a_{n+1}/a_n = 1$ implies nothing about the
convergence of $\sum a_n$. The series $\sum 1/n$ and $\sum 1/n^2$ demonstrate this.


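3.35 Examples

(a) Consider the series

$$\frac{1}{2} + \frac{1}{3} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^3} + \frac{1}{2^4} + \frac{1}{3^4} + \cdots,$$

for which

$$\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \left(\frac{2}{3}\right)^n = 0,$$

$$\liminf_{n\to\infty} \sqrt[n]{a_n} = \lim_{n\to\infty} \sqrt[2n]{\frac{1}{3^n}} = \frac{1}{\sqrt{3}},$$

$$\limsup_{n\to\infty} \sqrt[n]{a_n} = \lim_{n\to\infty} \sqrt[2n]{\frac{1}{2^n}} = \frac{1}{\sqrt{2}},$$

$$\limsup_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{1}{2}\left(\frac{3}{2}\right)^n = +\infty.$$

The root test indicates convergence; the ratio test does not apply.

(b) The same is true for the series

$$\frac{1}{2} + 1 + \frac{1}{8} + \frac{1}{4} + \frac{1}{32} + \frac{1}{16} + \frac{1}{128} + \frac{1}{64} + \cdots,$$

where

$$\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} = \frac{1}{8}, \qquad \limsup_{n\to\infty} \frac{a_{n+1}}{a_n} = 2,$$

but

$$\lim_{n\to\infty} \sqrt[n]{a_n} = \frac{1}{2}.$$

3.36 Remarks The ratio test is frequently easier to apply than the root test,
since it is usually easier to compute ratios than $n$th roots. However, the root
test has wider scope. More precisely: Whenever the ratio test shows convergence,
the root test does too; whenever the root test is inconclusive, the ratio
test is too. This is a consequence of Theorem 3.37, and is illustrated by the
above examples.

Neither of the two tests is subtle with regard to divergence. Both deduce
divergence from the fact that $a_n$ does not tend to zero as $n \to \infty$.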
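The behaviour described in Example 3.35(b) can be reproduced directly. The sketch below is an illustration, not part of the text; the indexing convention (terms numbered from $n = 1$, with $a_n = 2^{-n}$ for odd $n$ and $a_n = 2^{-(n-2)}$ for even $n$) is an assumption read off from the listed terms.

```python
from fractions import Fraction


def a(n):
    """n-th term (n >= 1) of 1/2 + 1 + 1/8 + 1/4 + 1/32 + 1/16 + ..."""
    exponent = n if n % 2 == 1 else n - 2
    return Fraction(1, 2 ** exponent)


# Consecutive ratios take only two values, so the ratio test is inconclusive.
ratios = {a(n + 1) / a(n) for n in range(1, 200)}
print(sorted(ratios))

# The n-th roots settle near 1/2, so the root test gives convergence.
roots = [float(a(n)) ** (1.0 / n) for n in (10, 100, 1000)]
print(roots)
```

The ratios oscillate between $1/8$ and $2$, so their upper limit exceeds 1, while the $n$th roots approach $1/2 < 1$.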

3.37 Theorem For any sequence $\{c_n\}$ of positive numbers,

$$\liminf_{n\to\infty} \frac{c_{n+1}}{c_n} \le \liminf_{n\to\infty} \sqrt[n]{c_n},$$

$$\limsup_{n\to\infty} \sqrt[n]{c_n} \le \limsup_{n\to\infty} \frac{c_{n+1}}{c_n}.$$

Proof We shall prove the second inequality; the proof of the first is
quite similar. Put

$$\alpha = \limsup_{n\to\infty} \frac{c_{n+1}}{c_n}.$$

If $\alpha = +\infty$, there is nothing to prove. If $\alpha$ is finite, choose $\beta > \alpha$. There
is an integer $N$ such that

$$\frac{c_{n+1}}{c_n} \le \beta$$

for $n \ge N$. In particular, for any $p > 0$,

$$c_{N+k+1} \le \beta c_{N+k} \qquad (k = 0, 1, \dots, p - 1).$$

Multiplying these inequalities, we obtain

$$c_{N+p} \le \beta^p c_N,$$

or

$$c_n \le c_N \beta^{-N} \cdot \beta^n \qquad (n \ge N).$$

Hence

$$\sqrt[n]{c_n} \le \sqrt[n]{c_N \beta^{-N}} \cdot \beta,$$

so that

$$\limsup_{n\to\infty} \sqrt[n]{c_n} \le \beta, \tag{18}$$

by Theorem 3.20(b). Since (18) is true for every $\beta > \alpha$, we have

$$\limsup_{n\to\infty} \sqrt[n]{c_n} \le \alpha.$$


POWER SERIES 

3.38 Definition Given a sequence $\{c_n\}$ of complex numbers, the series

$$\sum_{n=0}^{\infty} c_n z^n \tag{19}$$

is called a power series. The numbers $c_n$ are called the coefficients of the series;
$z$ is a complex number.

In general, the series will converge or diverge, depending on the choice
of $z$. More specifically, with every power series there is associated a circle, the
circle of convergence, such that (19) converges if $z$ is in the interior of the circle
and diverges if $z$ is in the exterior (to cover all cases, we have to consider the
plane as the interior of a circle of infinite radius, and a point as a circle of radius
zero). The behavior on the circle of convergence is much more varied and cannot
be described so simply.

3.39 Theorem Given the power series $\sum c_n z^n$, put

$$\alpha = \limsup_{n\to\infty} \sqrt[n]{|c_n|}, \qquad R = \frac{1}{\alpha}.$$

(If $\alpha = 0$, $R = +\infty$; if $\alpha = +\infty$, $R = 0$.) Then $\sum c_n z^n$ converges if $|z| < R$, and
diverges if $|z| > R$.

Proof Put $a_n = c_n z^n$, and apply the root test:

$$\limsup_{n\to\infty} \sqrt[n]{|a_n|} = |z| \limsup_{n\to\infty} \sqrt[n]{|c_n|} = \frac{|z|}{R}.$$

Note: $R$ is called the radius of convergence of $\sum c_n z^n$.

3.40 Examples

(a) The series $\sum n^n z^n$ has $R = 0$.

(b) The series $\sum \frac{z^n}{n!}$ has $R = +\infty$. (In this case the ratio test is easier to
apply than the root test.)

(c) The series $\sum z^n$ has $R = 1$. If $|z| = 1$, the series diverges, since $\{z^n\}$
does not tend to 0 as $n \to \infty$.

(d) The series $\sum \frac{z^n}{n}$ has $R = 1$. It diverges if $z = 1$. It converges for all
other $z$ with $|z| = 1$. (The last assertion will be proved in Theorem 3.44.)

(e) The series $\sum \frac{z^n}{n^2}$ has $R = 1$. It converges for all $z$ with $|z| = 1$, by
the comparison test, since $|z^n / n^2| = 1/n^2$.


SUMMATION BY PARTS 

3.41 Theorem Given two sequences $\{a_n\}$, $\{b_n\}$, put

$$A_n = \sum_{k=0}^{n} a_k$$

if $n \ge 0$; put $A_{-1} = 0$. Then, if $0 \le p \le q$, we have

$$\sum_{n=p}^{q} a_n b_n = \sum_{n=p}^{q-1} A_n (b_n - b_{n+1}) + A_q b_q - A_{p-1} b_p. \tag{20}$$

Proof

$$\sum_{n=p}^{q} a_n b_n = \sum_{n=p}^{q} (A_n - A_{n-1}) b_n = \sum_{n=p}^{q} A_n b_n - \sum_{n=p-1}^{q-1} A_n b_{n+1},$$

and the last expression on the right is clearly equal to the right side of
(20).
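Since (20) is a purely algebraic identity, it can be verified exhaustively on small integer data. The sketch below is illustrative; the function names and the sample lists are assumptions made here, and lists are indexed from 0 to match the theorem.

```python
def partial_sums(a):
    """Return [A_0, A_1, ...] with A_n = a_0 + ... + a_n."""
    out, total = [], 0
    for term in a:
        total += term
        out.append(total)
    return out


def lhs(a, b, p, q):
    """Left side of (20): sum of a_n * b_n for p <= n <= q."""
    return sum(a[n] * b[n] for n in range(p, q + 1))


def rhs(a, b, p, q):
    """Right side of (20), using the convention A_{-1} = 0."""
    A = partial_sums(a)
    A_prev = A[p - 1] if p >= 1 else 0
    return (sum(A[n] * (b[n] - b[n + 1]) for n in range(p, q))
            + A[q] * b[q] - A_prev * b[p])


a = [3, -1, 4, 1, -5, 9, 2, -6]   # arbitrary sample data
b = [2, 7, 1, 8, 2, 8, 1, 8]
print(all(lhs(a, b, p, q) == rhs(a, b, p, q)
          for p in range(len(a)) for q in range(p, len(a))))
```

The identity holds for every admissible pair $p \le q$, including $p = 0$, where the boundary term $A_{-1} b_p$ vanishes.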

Formula (20), the so-called “partial summation formula,” is useful in the
investigation of series of the form $\sum a_n b_n$, particularly when $\{b_n\}$ is monotonic.
We shall now give applications.


3.42 Theorem Suppose

(a) the partial sums $A_n$ of $\sum a_n$ form a bounded sequence;

(b) $b_0 \ge b_1 \ge b_2 \ge \cdots$;

(c) $\lim_{n\to\infty} b_n = 0$.

Then $\sum a_n b_n$ converges.

Proof Choose $M$ such that $|A_n| \le M$ for all $n$. Given $\varepsilon > 0$, there is an
integer $N$ such that $b_N \le \varepsilon / (2M)$. For $N \le p \le q$, we have

$$\begin{aligned}
\left| \sum_{n=p}^{q} a_n b_n \right| &= \left| \sum_{n=p}^{q-1} A_n (b_n - b_{n+1}) + A_q b_q - A_{p-1} b_p \right| \\
&\le M \left| \sum_{n=p}^{q-1} (b_n - b_{n+1}) + b_q + b_p \right| \\
&= 2M b_p \le 2M b_N \le \varepsilon.
\end{aligned}$$

Convergence now follows from the Cauchy criterion. We note that the
first inequality in the above chain depends of course on the fact that
$b_n - b_{n+1} \ge 0$.

3.43 Theorem Suppose

(a) $|c_1| \ge |c_2| \ge |c_3| \ge \cdots$;

(b) $c_{2m-1} \ge 0$, $c_{2m} \le 0$ $(m = 1, 2, 3, \dots)$;

(c) $\lim_{n\to\infty} c_n = 0$.

Then $\sum c_n$ converges.

Series for which (b) holds are called “alternating series”; the theorem was
known to Leibnitz.

Proof Apply Theorem 3.42, with $a_n = (-1)^{n+1}$, $b_n = |c_n|$.


3.44 Theorem Suppose the radius of convergence of $\sum c_n z^n$ is 1, and suppose
$c_0 \ge c_1 \ge c_2 \ge \cdots$, $\lim_{n\to\infty} c_n = 0$. Then $\sum c_n z^n$ converges at every point on the
circle $|z| = 1$, except possibly at $z = 1$.

Proof Put $a_n = z^n$, $b_n = c_n$. The hypotheses of Theorem 3.42 are then
satisfied, since

$$|A_n| = \left| \sum_{m=0}^{n} z^m \right| = \left| \frac{1 - z^{n+1}}{1 - z} \right| \le \frac{2}{|1 - z|},$$

if $|z| = 1$, $z \ne 1$.
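The bound on $|A_n|$ used in this proof can be observed numerically. The following is a rough floating-point sketch; the sample point $z = e^{0.3i}$ on the unit circle and the function name are assumptions chosen here for illustration.

```python
import cmath


def geometric_partial_sums(z, count):
    """Return [A_0, ..., A_{count-1}] with A_n = 1 + z + ... + z**n."""
    sums, total, power = [], 0j, 1 + 0j
    for _ in range(count):
        total += power
        sums.append(total)
        power *= z
    return sums


z = cmath.exp(1j * 0.3)            # a point with |z| = 1, z != 1
bound = 2 / abs(1 - z)             # the bound 2 / |1 - z| from the proof
largest = max(abs(A) for A in geometric_partial_sums(z, 5000))
print(largest, bound)
```

At $z = 1$ the same partial sums are $1, 2, 3, \dots$ and are unbounded, which is why that point is excluded from the conclusion.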


ABSOLUTE CONVERGENCE 

The series $\sum a_n$ is said to converge absolutely if the series $\sum |a_n|$ converges.

3.45 Theorem If $\sum a_n$ converges absolutely, then $\sum a_n$ converges.

Proof The assertion follows from the inequality

$$\left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k|,$$

plus the Cauchy criterion.

3.46 Remarks For series of positive terms, absolute convergence is the same 
as convergence. 

If $\sum a_n$ converges, but $\sum |a_n|$ diverges, we say that $\sum a_n$ converges
nonabsolutely. For instance, the series

$$\sum \frac{(-1)^n}{n}$$

converges nonabsolutely (Theorem 3.43).

The comparison test, as well as the root and ratio tests, is really a test for 
absolute convergence, and therefore cannot give any information about non- 
absolutely convergent series. Summation by parts can sometimes be used to 
handle the latter. In particular, power series converge absolutely in the interior 
of the circle of convergence. 

We shall see that we may operate with absolutely convergent series very 
much as with finite sums. We may multiply them term by term and we may 
change the order in which the additions are carried out, without affecting the 
sum of the series. But for nonabsolutely convergent series this is no longer true, 
and more care has to be taken when dealing with them. 


ADDITION AND MULTIPLICATION OF SERIES 

3.47 Theorem If $\sum a_n = A$, and $\sum b_n = B$, then $\sum (a_n + b_n) = A + B$, and
$\sum c a_n = cA$, for any fixed $c$.

Proof Let

$$A_n = \sum_{k=0}^{n} a_k, \qquad B_n = \sum_{k=0}^{n} b_k.$$

Then

$$A_n + B_n = \sum_{k=0}^{n} (a_k + b_k).$$

Since $\lim_{n\to\infty} A_n = A$ and $\lim_{n\to\infty} B_n = B$, we see that

$$\lim_{n\to\infty} (A_n + B_n) = A + B.$$

The proof of the second assertion is even simpler. 

Thus two convergent series may be added term by term, and the resulting
series converges to the sum of the two series. The situation becomes more
complicated when we consider multiplication of two series. To begin with, we
have to define the product. This can be done in several ways; we shall consider
the so-called “Cauchy product.”

3.48 Definition Given $\sum a_n$ and $\sum b_n$, we put

$$c_n = \sum_{k=0}^{n} a_k b_{n-k} \qquad (n = 0, 1, 2, \dots)$$

and call $\sum c_n$ the product of the two given series.

This definition may be motivated as follows. If we take two power
series $\sum a_n z^n$ and $\sum b_n z^n$, multiply them term by term, and collect
terms containing the same power of $z$, we get

$$\sum_{n=0}^{\infty} a_n z^n \cdot \sum_{n=0}^{\infty} b_n z^n
  = (a_0 + a_1 z + a_2 z^2 + \cdots)(b_0 + b_1 z + b_2 z^2 + \cdots)$$
$$= a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + \cdots$$
$$= c_0 + c_1 z + c_2 z^2 + \cdots.$$

Setting $z = 1$, we arrive at the above definition.
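
As a small illustration of Definition 3.48, the following sketch computes the
first few terms $c_n$ of the Cauchy product from finite lists of coefficients.
The function name `cauchy_product` and the use of Python lists are assumptions
made for this example only; they are not part of the text.

```python
def cauchy_product(a, b):
    """Return [c_0, c_1, ...] with c_n = sum_{k=0}^{n} a[k] * b[n - k].

    Only as many terms are produced as both input lists can support.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


# Coefficients of (1 + z + z^2 + z^3 + ...)^2 = 1 + 2z + 3z^2 + 4z^3 + ...
print(cauchy_product([1, 1, 1, 1], [1, 1, 1, 1]))
```

Each $c_n$ collects exactly the products $a_k b_{n-k}$ whose indices add up to
$n$, which is the coefficient of $z^n$ in the motivating power series product.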

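3.49 Example If

$$A_n = \sum_{k=0}^{n} a_k, \qquad B_n = \sum_{k=0}^{n} b_k, \qquad C_n = \sum_{k=0}^{n} c_k,$$

and $A_n \to A$, $B_n \to B$, then it is not at all clear that $\{C_n\}$ will converge
to $AB$, since we do not have $C_n = A_n B_n$. The dependence of $\{C_n\}$ on $\{A_n\}$
and $\{B_n\}$ is quite a complicated one (see the proof of Theorem 3.50). We shall
now show that the product of two convergent series may actually diverge.

The series

$$\sum_{n=0}^{\infty} \frac{(-1)^n}{\sqrt{n+1}}
  = 1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \cdots$$

converges (Theorem 3.43). We form the product of this series with itself and
obtain

$$\sum_{n=0}^{\infty} c_n = 1 - \left(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\right)
  + \left(\frac{1}{\sqrt{3}} + \frac{1}{\sqrt{2}\sqrt{2}} + \frac{1}{\sqrt{3}}\right)
  - \left(\frac{1}{\sqrt{4}} + \frac{1}{\sqrt{3}\sqrt{2}} + \frac{1}{\sqrt{2}\sqrt{3}} + \frac{1}{\sqrt{4}}\right) + \cdots,$$

so that

$$c_n = (-1)^n \sum_{k=0}^{n} \frac{1}{\sqrt{(n-k+1)(k+1)}}.$$

Since

$$(n-k+1)(k+1) = \left(\frac{n}{2} + 1\right)^2 - \left(\frac{n}{2} - k\right)^2
  \le \left(\frac{n}{2} + 1\right)^2,$$

we have

$$|c_n| \ge \sum_{k=0}^{n} \frac{2}{n+2} = \frac{2(n+1)}{n+2},$$

so that the condition $c_n \to 0$, which is necessary for the convergence of
$\sum c_n$, is not satisfied.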
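
The lower bound in Example 3.49 can be checked numerically. The sketch below
evaluates $|c_n|$ in floating point straight from the formula for $c_n$ and
prints it next to $2(n+1)/(n+2)$. The helper name `c` is an assumption of this
example, and the check illustrates the estimate without replacing its proof.

```python
import math


def c(n):
    """n-th term of the Cauchy product of sum (-1)^n / sqrt(n + 1) with itself."""
    total = sum(1.0 / math.sqrt((n - k + 1) * (k + 1)) for k in range(n + 1))
    return (-1) ** n * total


for n in (0, 1, 10, 100, 1000):
    lower_bound = 2 * (n + 1) / (n + 2)
    # |c_n| stays above the bound, so c_n cannot tend to 0.
    print(n, abs(c(n)), lower_bound)
```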

In view of the next theorem, due to Mertens, we note that we have here 
considered the product of two nonabsolutely convergent series. 

3.50 Theorem Suppose

(a) $\sum_{n=0}^{\infty} a_n$ converges absolutely,

(b) $\sum_{n=0}^{\infty} a_n = A$,

(c) $\sum_{n=0}^{\infty} b_n = B$,

(d) $c_n = \sum_{k=0}^{n} a_k b_{n-k}$ $(n = 0, 1, 2, \dots)$.

Then

$$\sum_{n=0}^{\infty} c_n = AB.$$

That is, the product of two convergent series converges, and to the right
value, if at least one of the two series converges absolutely.

Proof Put

$$A_n = \sum_{k=0}^{n} a_k, \qquad B_n = \sum_{k=0}^{n} b_k, \qquad
  C_n = \sum_{k=0}^{n} c_k, \qquad \beta_n = B_n - B.$$

Then

$$C_n = a_0 b_0 + (a_0 b_1 + a_1 b_0) + \cdots + (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0)$$
$$= a_0 B_n + a_1 B_{n-1} + \cdots + a_n B_0$$
$$= a_0 (B + \beta_n) + a_1 (B + \beta_{n-1}) + \cdots + a_n (B + \beta_0)$$
$$= A_n B + a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0.$$

Put

$$\gamma_n = a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0.$$

We wish to show that $C_n \to AB$. Since $A_n B \to AB$, it suffices to
show that

$$\lim_{n \to \infty} \gamma_n = 0. \tag{21}$$

Put

$$\alpha = \sum_{n=0}^{\infty} |a_n|.$$

[It is here that we use (a).] Let $\varepsilon > 0$ be given. By (c), $\beta_n \to 0$.
Hence we can choose $N$ such that $|\beta_n| \le \varepsilon$ for $n \ge N$, in which case

$$|\gamma_n| \le |\beta_0 a_n + \cdots + \beta_N a_{n-N}| + |\beta_{N+1} a_{n-N-1} + \cdots + \beta_n a_0|$$
$$\le |\beta_0 a_n + \cdots + \beta_N a_{n-N}| + \varepsilon \alpha.$$

Keeping $N$ fixed, and letting $n \to \infty$, we get

$$\limsup_{n \to \infty} |\gamma_n| \le \varepsilon \alpha,$$

since $a_k \to 0$ as $k \to \infty$. Since $\varepsilon$ is arbitrary, (21) follows.
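
A numerical illustration of Theorem 3.50, under assumptions chosen only for
this sketch: take $a_n = 2^{-n}$ (absolutely convergent, $A = 2$) and
$b_n = (-1)^n/(n+1)$ (convergent but not absolutely, with the standard value
$B = \log 2$). The partial sums $C_n$ of the Cauchy product should then
approach $2 \log 2$. The function name `partial_sums_of_product` is
hypothetical.

```python
import math


def partial_sums_of_product(a, b):
    """Partial sums C_0, C_1, ... of the Cauchy product of two finite lists."""
    sums, running = [], 0.0
    for n in range(min(len(a), len(b))):
        running += sum(a[k] * b[n - k] for k in range(n + 1))  # adds c_n
        sums.append(running)
    return sums


N = 1000
a = [0.5 ** n for n in range(N + 1)]             # converges absolutely, A = 2
b = [(-1) ** n / (n + 1) for n in range(N + 1)]  # converges nonabsolutely, B = log 2
C = partial_sums_of_product(a, b)
print(C[-1], 2 * math.log(2))
```

The convergence is slow, because the error is governed by the tails
$\beta_n = B_n - B$ of the nonabsolutely convergent factor, exactly the
quantities estimated in the proof.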

Another question which may be asked is whether the series $\sum c_n$, if
convergent, must have the sum $AB$. Abel showed that the answer is in the
affirmative.

3.51 Theorem If the series $\sum a_n$, $\sum b_n$, $\sum c_n$ converge to $A$, $B$, $C$,
and $c_n = a_0 b_n + \cdots + a_n b_0$, then $C = AB$.

Here no assumption is made concerning absolute convergence. We shall 
give a simple proof (which depends on the continuity of power series) after 
Theorem 8.2. 


REARRANGEMENTS 

3.52 Definition Let $\{k_n\}$, $n = 1, 2, 3, \dots$, be a sequence in which every
positive integer appears once and only once (that is, $\{k_n\}$ is a 1-1 function
from $J$ onto $J$, in the notation of Definition 2.2). Putting

$$a'_n = a_{k_n} \qquad (n = 1, 2, 3, \dots),$$

we say that $\sum a'_n$ is a rearrangement of $\sum a_n$.

If $\{s_n\}$, $\{s'_n\}$ are the sequences of partial sums of $\sum a_n$, $\sum a'_n$,
it is easily seen that, in general, these two sequences consist of entirely
different numbers. We are thus led to the problem of determining under what
conditions all rearrangements of a convergent series will converge and whether
the sums are necessarily the same.

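3.53 Example Consider the convergent series

$$1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \tfrac{1}{5} - \tfrac{1}{6} + \cdots \tag{22}$$

and one of its rearrangements

$$1 + \tfrac{1}{3} - \tfrac{1}{2} + \tfrac{1}{5} + \tfrac{1}{7} - \tfrac{1}{4}
  + \tfrac{1}{9} + \tfrac{1}{11} - \tfrac{1}{6} + \cdots \tag{23}$$

in which two positive terms are always followed by one negative. If $s$ is the
sum of (22), then

$$s < 1 - \tfrac{1}{2} + \tfrac{1}{3} = \tfrac{5}{6}.$$

Since

$$\frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} > 0$$

for $k \ge 1$, we see that $s'_3 < s'_6 < s'_9 < \cdots$, where $s'_n$ is the $n$th
partial sum of (23). Hence

$$\limsup_{n \to \infty} s'_n > s'_3 = \tfrac{5}{6},$$

so that (23) certainly does not converge to $s$ [we leave it to the reader to
verify that (23) does, however, converge].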
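
The partial sums $s'_{3k}$ of the rearranged series (23) can be computed
exactly with rational arithmetic. The following is a sketch only; the function
name `rearranged_partial_sum` is an assumption of this example. It sums the
blocks $\frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k}$ that make up (23).

```python
from fractions import Fraction


def rearranged_partial_sum(n_blocks):
    """Exact value of s'_{3 * n_blocks}, the sum of the first n_blocks blocks of (23)."""
    total = Fraction(0)
    for k in range(1, n_blocks + 1):
        total += Fraction(1, 4 * k - 3) + Fraction(1, 4 * k - 1) - Fraction(1, 2 * k)
    return total


print(rearranged_partial_sum(1))             # s'_3
print(float(rearranged_partial_sum(200)))    # s'_600
```

Every block is positive, so these values increase with the number of blocks
and all of them are at least $s'_3 = 5/6$, which already exceeds the sum $s$ of
(22). It is a standard fact, not proved here, that (23) converges to
$\tfrac{3}{2} \log 2$.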

This example illustrates the following theorem, due to Riemann. 


3.54 Theorem Let $\sum a_n$ be a series of real numbers which converges, but not
absolutely. Suppose

$$-\infty \le \alpha \le \beta \le \infty.$$

Then there exists a rearrangement $\sum a'_n$ with partial sums $s'_n$ such that

$$\liminf_{n \to \infty} s'_n = \alpha, \qquad \limsup_{n \to \infty} s'_n = \beta. \tag{24}$$

Proof Let

$$p_n = \frac{|a_n| + a_n}{2}, \qquad q_n = \frac{|a_n| - a_n}{2} \qquad (n = 1, 2, 3, \dots).$$

Then $p_n - q_n = a_n$, $p_n + q_n = |a_n|$, $p_n \ge 0$, $q_n \ge 0$. The series
$\sum p_n$, $\sum q_n$ must both diverge.

For if both were convergent, then

$$\sum (p_n + q_n) = \sum |a_n|$$

would converge, contrary to hypothesis. Since

$$\sum_{n=1}^{N} a_n = \sum_{n=1}^{N} (p_n - q_n) = \sum_{n=1}^{N} p_n - \sum_{n=1}^{N} q_n,$$

divergence of $\sum p_n$ and convergence of $\sum q_n$ (or vice versa) implies
divergence of $\sum a_n$, again contrary to hypothesis.

Now let $P_1, P_2, P_3, \dots$ denote the nonnegative terms of $\sum a_n$, in the
order in which they occur, and let $Q_1, Q_2, Q_3, \dots$ be the absolute values
of the negative terms of $\sum a_n$, also in their original order.

The series $\sum P_n$, $\sum Q_n$ differ from $\sum p_n$, $\sum q_n$ only by zero
terms, and are therefore divergent.

We shall construct sequences $\{m_n\}$, $\{k_n\}$, such that the series

$$P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{k_1} + P_{m_1+1} + \cdots
  + P_{m_2} - Q_{k_1+1} - \cdots - Q_{k_2} + \cdots, \tag{25}$$

which clearly is a rearrangement of $\sum a_n$, satisfies (24).

Choose real-valued sequences $\{\alpha_n\}$, $\{\beta_n\}$ such that
$\alpha_n \to \alpha$, $\beta_n \to \beta$, $\alpha_n < \beta_n$, $\beta_1 > 0$.

Let $m_1$, $k_1$ be the smallest integers such that

$$P_1 + \cdots + P_{m_1} > \beta_1,$$
$$P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{k_1} < \alpha_1;$$

let $m_2$, $k_2$ be the smallest integers such that

$$P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{k_1} + P_{m_1+1} + \cdots + P_{m_2} > \beta_2,$$
$$P_1 + \cdots + P_{m_1} - Q_1 - \cdots - Q_{k_1} + P_{m_1+1} + \cdots + P_{m_2}
  - Q_{k_1+1} - \cdots - Q_{k_2} < \alpha_2;$$

and continue in this way. This is possible since $\sum P_n$ and $\sum Q_n$ diverge.

If $x_n$, $y_n$ denote the partial sums of (25) whose last terms are $P_{m_n}$,
$-Q_{k_n}$, then

$$|x_n - \beta_n| \le P_{m_n}, \qquad |y_n - \alpha_n| \le Q_{k_n}.$$

Since $P_n \to 0$ and $Q_n \to 0$ as $n \to \infty$, we see that $x_n \to \beta$,
$y_n \to \alpha$.

Finally, it is clear that no number less than $\alpha$ or greater than $\beta$ can
be a subsequential limit of the partial sums of (25).
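
The construction in this proof can be imitated numerically. The sketch below
is a simplified variant, not the construction of the text itself: it treats
only the case $\alpha = \beta$ (called `target` here), works with the specific
series $1 - \tfrac12 + \tfrac13 - \cdots$, and produces finitely many terms.
It adds unused positive terms while the running sum is at most the target and
unused negative terms otherwise. All names are assumptions of this example.

```python
import math


def rearrange_to_target(target, n_terms):
    """Greedily reorder 1 - 1/2 + 1/3 - ... so the partial sums approach target."""
    next_odd, next_even = 1, 2        # denominators of the next unused P and Q terms
    total, order = 0.0, []
    for _ in range(n_terms):
        if total <= target:
            term = 1.0 / next_odd     # next positive term, in original order
            next_odd += 2
        else:
            term = -1.0 / next_even   # next negative term, in original order
            next_even += 2
        total += term
        order.append(term)
    return total, order


total, order = rearrange_to_target(math.pi, 200_000)
print(total, math.pi)
```

Each time the running sum crosses the target it overshoots by at most the size
of the term just added, and those terms tend to 0; this is the same mechanism
that gives $x_n \to \beta$ and $y_n \to \alpha$ in the proof.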

3.55 Theorem If $\sum a_n$ is a series of complex numbers which converges
absolutely, then every rearrangement of $\sum a_n$ converges, and they all
converge to the same sum.

Proof Let $\sum a'_n$ be a rearrangement, with partial sums $s'_n$. Given
$\varepsilon > 0$, there exists an integer $N$ such that $m \ge n \ge N$ implies

$$\sum_{i=n}^{m} |a_i| \le \varepsilon. \tag{26}$$

Now choose $p$ such that the integers $1, 2, \dots, N$ are all contained in the
set $k_1, k_2, \dots, k_p$ (we use the notation of Definition 3.52). Then if
$n > p$, the numbers $a_1, \dots, a_N$ will cancel in the difference $s_n - s'_n$,
so that $|s_n - s'_n| \le \varepsilon$, by (26). Hence $\{s'_n\}$ converges to the
same sum as $\{s_n\}$.


EXERCISES 

1. Prove that convergence of $\{s_n\}$ implies convergence of $\{|s_n|\}$. Is the
converse true?

2. Calculate $\lim_{n \to \infty} \left(\sqrt{n^2 + n} - n\right)$.

3. If $s_1 = \sqrt{2}$, and

$$s_{n+1} = \sqrt{2 + \sqrt{s_n}} \qquad (n = 1, 2, 3, \dots),$$

prove that $\{s_n\}$ converges, and that $s_n < 2$ for $n = 1, 2, 3, \dots$.

4. Find the upper and lower limits of the sequence $\{s_n\}$ defined by

$$s_1 = 0; \qquad s_{2m} = \frac{s_{2m-1}}{2}; \qquad s_{2m+1} = \frac{1}{2} + s_{2m}.$$

5. For any two real sequences $\{a_n\}$, $\{b_n\}$, prove that

$$\limsup_{n \to \infty} (a_n + b_n) \le \limsup_{n \to \infty} a_n + \limsup_{n \to \infty} b_n,$$

provided the sum on the right is not of the form $\infty - \infty$.

6. Investigate the behavior (convergence or divergence) of $\sum a_n$ if

(a) $a_n = \sqrt{n+1} - \sqrt{n}$;

(b) $a_n = \dfrac{\sqrt{n+1} - \sqrt{n}}{n}$;

(c) $a_n = \left(\sqrt[n]{n} - 1\right)^n$;

(d) $a_n = \dfrac{1}{1 + z^n}$, for complex values of $z$.

7. Prove that the convergence of $\sum a_n$ implies the convergence of

$$\sum \frac{\sqrt{a_n}}{n},$$

if $a_n \ge 0$.

8. If $\sum a_n$ converges, and if $\{b_n\}$ is monotonic and bounded, prove that
$\sum a_n b_n$ converges.

9. Find the radius of convergence of each of the following power series:

(a) $\sum n^3 z^n$, (b) $\sum \dfrac{2^n}{n!} z^n$,

(c) $\sum \dfrac{2^n}{n^2} z^n$, (d) $\sum \dfrac{n^3}{3^n} z^n$.

10. Suppose that the coefficients of the power series $\sum a_n z^n$ are integers,
infinitely many of which are distinct from zero. Prove that the radius of
convergence is at most 1.

11. Suppose $a_n > 0$, $s_n = a_1 + \cdots + a_n$, and $\sum a_n$ diverges.

(a) Prove that $\sum \dfrac{a_n}{1 + a_n}$ diverges.

(b) Prove that

$$\frac{a_{N+1}}{s_{N+1}} + \cdots + \frac{a_{N+k}}{s_{N+k}} \ge 1 - \frac{s_N}{s_{N+k}}$$

and deduce that $\sum \dfrac{a_n}{s_n}$ diverges.

(c) Prove that

$$\frac{a_n}{s_n^2} \le \frac{1}{s_{n-1}} - \frac{1}{s_n}$$

and deduce that $\sum \dfrac{a_n}{s_n^2}$ converges.

(d) What can be said about

$$\sum \frac{a_n}{1 + n a_n} \qquad \text{and} \qquad \sum \frac{a_n}{1 + n^2 a_n}?$$

12. Suppose $a_n > 0$ and $\sum a_n$ converges. Put

$$r_n = \sum_{m=n}^{\infty} a_m.$$

(a) Prove that

$$\frac{a_m}{r_m} + \cdots + \frac{a_n}{r_n} > 1 - \frac{r_n}{r_m}$$

if $m < n$, and deduce that $\sum \dfrac{a_n}{r_n}$ diverges.

(b) Prove that

$$\frac{a_n}{\sqrt{r_n}} < 2\left(\sqrt{r_n} - \sqrt{r_{n+1}}\right)$$

and deduce that $\sum \dfrac{a_n}{\sqrt{r_n}}$ converges.

13. Prove that the Cauchy product of two absolutely convergent series converges
absolutely.

14. If $\{s_n\}$ is a complex sequence, define its arithmetic means $\sigma_n$ by

$$\sigma_n = \frac{s_0 + s_1 + \cdots + s_n}{n + 1} \qquad (n = 0, 1, 2, \dots).$$

(a) If $\lim s_n = s$, prove that $\lim \sigma_n = s$.

(b) Construct a sequence $\{s_n\}$ which does not converge, although
$\lim \sigma_n = 0$.

(c) Can it happen that $s_n > 0$ for all $n$ and that $\limsup s_n = \infty$,
although $\lim \sigma_n = 0$?

(d) Put $a_n = s_n - s_{n-1}$, for $n \ge 1$. Show that

$$s_n - \sigma_n = \frac{1}{n+1} \sum_{k=1}^{n} k a_k.$$

Assume that $\lim (n a_n) = 0$ and that $\{\sigma_n\}$ converges. Prove that
$\{s_n\}$ converges. [This gives a converse of (a), but under the additional
assumption that $n a_n \to 0$.]

(e) Derive the last conclusion from a weaker hypothesis: Assume $M < \infty$,
$|n a_n| \le M$ for all $n$, and $\lim \sigma_n = \sigma$. Prove that
$\lim s_n = \sigma$, by completing the following outline:

If $m < n$, then

$$s_n - \sigma_n = \frac{m+1}{n-m} (\sigma_n - \sigma_m)
  + \frac{1}{n-m} \sum_{i=m+1}^{n} (s_n - s_i).$$

For these $i$,

$$|s_n - s_i| \le \frac{(n-i) M}{i+1} \le \frac{(n-m-1) M}{m+2}.$$

Fix $\varepsilon > 0$ and associate with each $n$ the integer $m$ that satisfies

$$m \le \frac{n - \varepsilon}{1 + \varepsilon} < m + 1.$$

Then $(m+1)/(n-m) \le 1/\varepsilon$ and $|s_n - s_i| < M\varepsilon$. Hence

$$\limsup_{n \to \infty} |s_n - \sigma| \le M \varepsilon.$$

Since $\varepsilon$ was arbitrary, $\lim s_n = \sigma$.

15. Definition 3.21 can be extended to the case in which the $a_n$ lie in some
fixed $R^k$. Absolute convergence is defined as convergence of $\sum |a_n|$. Show
that Theorems 3.22, 3.23, 3.25(a), 3.33, 3.34, 3.42, 3.45, 3.47, and 3.55 are
true in this more general setting. (Only slight modifications are required in
any of the proofs.)

16. Fix a positive number $\alpha$. Choose $x_1 > \sqrt{\alpha}$, and define
$x_2, x_3, x_4, \dots$, by the recursion formula

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{\alpha}{x_n}\right).$$

(a) Prove that $\{x_n\}$ decreases monotonically and that
$\lim x_n = \sqrt{\alpha}$.

(b) Put $\varepsilon_n = x_n - \sqrt{\alpha}$, and show that

$$\varepsilon_{n+1} = \frac{\varepsilon_n^2}{2 x_n} < \frac{\varepsilon_n^2}{2\sqrt{\alpha}}$$

so that, setting $\beta = 2\sqrt{\alpha}$,

$$\varepsilon_{n+1} < \beta \left(\frac{\varepsilon_1}{\beta}\right)^{2^n} \qquad (n = 1, 2, 3, \dots).$$

(c) This is a good algorithm for computing square roots, since the recursion
formula is simple and the convergence is extremely rapid. For example, if
$\alpha = 3$ and $x_1 = 2$, show that $\varepsilon_1/\beta < \tfrac{1}{10}$ and
that therefore

$$\varepsilon_5 < 4 \cdot 10^{-16}, \qquad \varepsilon_6 < 4 \cdot 10^{-32}.$$
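
The recursion of Exercise 16 is easy to run. The sketch below is an
illustration under stated assumptions, not a solution of the exercise: it uses
Python's `decimal` module with 60 significant digits (a choice made here so
that errors near $10^{-32}$ are visible), and the name `sqrt_iterates` is
hypothetical. It prints the errors $x_n - \sqrt{3}$ for the starting value
$x_1 = 2$ of part (c).

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # working precision, well beyond the 32 digits of interest


def sqrt_iterates(alpha, x1, count):
    """Return [x_1, ..., x_count] for x_{n+1} = (x_n + alpha / x_n) / 2."""
    xs = [Decimal(x1)]
    for _ in range(count - 1):
        xs.append((xs[-1] + Decimal(alpha) / xs[-1]) / 2)
    return xs


xs = sqrt_iterates(3, 2, 6)
root = Decimal(3).sqrt()
errors = [x - root for x in xs]          # errors[n - 1] is x_n - sqrt(3)
for n, eps in enumerate(errors, start=1):
    print(n, eps)
```

The number of correct digits roughly doubles at each step, which is what the
bound in part (b) predicts.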


17. Fix $\alpha > 1$. Take $x_1 > \sqrt{\alpha}$, and define

$$x_{n+1} = \frac{\alpha + x_n}{1 + x_n} = x_n + \frac{\alpha - x_n^2}{1 + x_n}.$$

(a) Prove that $x_1 > x_3 > x_5 > \cdots$.

(b) Prove that $x_2 < x_4 < x_6 < \cdots$.

(c) Prove that $\lim x_n = \sqrt{\alpha}$.

(d) Compare the rapidity of convergence of this process with the one described
in Exercise 16.

18. Replace the recursion formula of Exercise 16 by

$$x_{n+1} = \frac{p-1}{p}\, x_n + \frac{\alpha}{p}\, x_n^{-p+1},$$

where $p$ is a fixed positive integer, and describe the behavior of the resulting
sequences $\{x_n\}$.

19. Associate to each sequence $a = \{\alpha_n\}$, in which $\alpha_n$ is 0 or 2,
the real number

$$x(a) = \sum_{n=1}^{\infty} \frac{\alpha_n}{3^n}.$$

Prove that the set of all $x(a)$ is precisely the Cantor set described in Sec. 2.44.

20. Suppose $\{p_n\}$ is a Cauchy sequence in a metric space $X$, and some
subsequence $\{p_{n_i}\}$ converges to a point $p \in X$. Prove that the full
sequence $\{p_n\}$ converges to $p$.

21. Prove the following analogue of Theorem 3.10(b): If $\{E_n\}$ is a sequence of
closed and bounded sets in a complete metric space $X$, if $E_n \supset E_{n+1}$,
and if

$$\lim_{n \to \infty} \operatorname{diam} E_n = 0,$$

then $\bigcap_1^\infty E_n$ consists of exactly one point.

22. Suppose $X$ is a complete metric space, and $\{G_n\}$ is a sequence of dense
open subsets of $X$. Prove Baire's theorem, namely, that $\bigcap_1^\infty G_n$
is not empty. (In fact, it is dense in $X$.) Hint: Find a shrinking sequence of
neighborhoods $E_n$ such that $\bar{E}_n \subset G_n$, and apply Exercise 21.

23. Suppose $\{p_n\}$ and $\{q_n\}$ are Cauchy sequences in a metric space $X$.
Show that the sequence $\{d(p_n, q_n)\}$ converges. Hint: For any $m$, $n$,

$$d(p_n, q_n) \le d(p_n, p_m) + d(p_m, q_m) + d(q_m, q_n);$$

it follows that

$$|d(p_n, q_n) - d(p_m, q_m)|$$

is small if $m$ and $n$ are large.

24. Let $X$ be a metric space.

(a) Call two Cauchy sequences $\{p_n\}$, $\{q_n\}$ in $X$ equivalent if

$$\lim_{n \to \infty} d(p_n, q_n) = 0.$$

Prove that this is an equivalence relation.

(b) Let $X^*$ be the set of all equivalence classes so obtained. If $P \in X^*$,
$Q \in X^*$, $\{p_n\} \in P$, $\{q_n\} \in Q$, define

$$\Delta(P, Q) = \lim_{n \to \infty} d(p_n, q_n);$$

by Exercise 23, this limit exists. Show that the number $\Delta(P, Q)$ is
unchanged if $\{p_n\}$ and $\{q_n\}$ are replaced by equivalent sequences, and
hence that $\Delta$ is a distance function in $X^*$.

(c) Prove that the resulting metric space $X^*$ is complete.

(d) For each $p \in X$, there is a Cauchy sequence all of whose terms are $p$;
let $P_p$ be the element of $X^*$ which contains this sequence. Prove that

$$\Delta(P_p, P_q) = d(p, q)$$

for all $p, q \in X$. In other words, the mapping $\varphi$ defined by
$\varphi(p) = P_p$ is an isometry (i.e., a distance-preserving mapping) of $X$
into $X^*$.

(e) Prove that $\varphi(X)$ is dense in $X^*$, and that $\varphi(X) = X^*$ if $X$
is complete. By (d), we may identify $X$ and $\varphi(X)$ and thus regard $X$ as
embedded in the complete metric space $X^*$. We call $X^*$ the completion of $X$.

25. Let $X$ be the metric space whose points are the rational numbers, with the
metric $d(x, y) = |x - y|$. What is the completion of this space? (Compare
Exercise 24.)



4 

CONTINUITY 


The function concept and some of the related terminology were introduced in
Definitions 2.1 and 2.2. Although we shall (in later chapters) be mainly interested
in real and complex functions (i.e., in functions whose values are real or complex
numbers) we shall also discuss vector-valued functions (i.e., functions with
values in $R^k$) and functions with values in an arbitrary metric space. The
theorems we shall discuss in this general setting would not become any easier if
we restricted ourselves to real functions, for instance, and it actually simplifies
and clarifies the picture to discard unnecessary hypotheses and to state and prove
theorems in an appropriately general context.

The domains of definition of our functions will also be metric spaces, 
suitably specialized in various instances. 

LIMITS OF FUNCTIONS 

4.1 Definition Let $X$ and $Y$ be metric spaces; suppose $E \subset X$, $f$ maps
$E$ into $Y$, and $p$ is a limit point of $E$. We write $f(x) \to q$ as $x \to p$, or

$$\lim_{x \to p} f(x) = q \tag{1}$$

if there is a point $q \in Y$ with the following property: For every
$\varepsilon > 0$ there exists a $\delta > 0$ such that

$$d_Y(f(x), q) < \varepsilon \tag{2}$$

for all points $x \in E$ for which

$$0 < d_X(x, p) < \delta. \tag{3}$$

The symbols $d_X$ and $d_Y$ refer to the distances in $X$ and $Y$, respectively.

If $X$ and/or $Y$ are replaced by the real line, the complex plane, or by some
euclidean space $R^k$, the distances $d_X$, $d_Y$ are of course replaced by
absolute values, or by appropriate norms (see Sec. 2.16).

It should be noted that $p \in X$, but that $p$ need not be a point of $E$
in the above definition. Moreover, even if $p \in E$, we may very well have
$f(p) \ne \lim_{x \to p} f(x)$.

We can recast this definition in terms of limits of sequences: 

4.2 Theorem Let $X$, $Y$, $E$, $f$, and $p$ be as in Definition 4.1. Then

$$\lim_{x \to p} f(x) = q \tag{4}$$

if and only if

$$\lim_{n \to \infty} f(p_n) = q \tag{5}$$

for every sequence $\{p_n\}$ in $E$ such that

$$p_n \ne p, \qquad \lim_{n \to \infty} p_n = p. \tag{6}$$

Proof Suppose (4) holds. Choose $\{p_n\}$ in $E$ satisfying (6). Let
$\varepsilon > 0$ be given. Then there exists $\delta > 0$ such that
$d_Y(f(x), q) < \varepsilon$ if $x \in E$ and $0 < d_X(x, p) < \delta$. Also, there
exists $N$ such that $n > N$ implies $0 < d_X(p_n, p) < \delta$. Thus, for $n > N$,
we have $d_Y(f(p_n), q) < \varepsilon$, which shows that (5) holds.

Conversely, suppose (4) is false. Then there exists some $\varepsilon > 0$ such
that for every $\delta > 0$ there exists a point $x \in E$ (depending on
$\delta$), for which $d_Y(f(x), q) \ge \varepsilon$ but $0 < d_X(x, p) < \delta$.
Taking $\delta_n = 1/n$ $(n = 1, 2, 3, \dots)$, we thus find a sequence in $E$
satisfying (6) for which (5) is false.

Corollary If $f$ has a limit at $p$, this limit is unique.

This follows from Theorems 3.2(b) and 4.2.





4.3 Definition Suppose we have two complex functions, $f$ and $g$, both defined
on $E$. By $f + g$ we mean the function which assigns to each point $x$ of $E$ the
number $f(x) + g(x)$. Similarly we define the difference $f - g$, the product $fg$,
and the quotient $f/g$ of the two functions, with the understanding that the
quotient is defined only at those points $x$ of $E$ at which $g(x) \ne 0$. If $f$
assigns to each point $x$ of $E$ the same number $c$, then $f$ is said to be a
constant function, or simply a constant, and we write $f = c$. If $f$ and $g$ are
real functions, and if $f(x) \ge g(x)$ for every $x \in E$, we shall sometimes
write $f \ge g$, for brevity.

Similarly, if $\mathbf{f}$ and $\mathbf{g}$ map $E$ into $R^k$, we define
$\mathbf{f} + \mathbf{g}$ and $\mathbf{f} \cdot \mathbf{g}$ by

$$(\mathbf{f} + \mathbf{g})(x) = \mathbf{f}(x) + \mathbf{g}(x), \qquad
  (\mathbf{f} \cdot \mathbf{g})(x) = \mathbf{f}(x) \cdot \mathbf{g}(x);$$

and if $\lambda$ is a real number, $(\lambda \mathbf{f})(x) = \lambda \mathbf{f}(x)$.

4.4 Theorem Suppose $E \subset X$, a metric space, $p$ is a limit point of $E$,
$f$ and $g$ are complex functions on $E$, and

$$\lim_{x \to p} f(x) = A, \qquad \lim_{x \to p} g(x) = B.$$

Then

(a) $\lim_{x \to p} (f + g)(x) = A + B$;

(b) $\lim_{x \to p} (fg)(x) = AB$;

(c) $\lim_{x \to p} \left(\dfrac{f}{g}\right)(x) = \dfrac{A}{B}$, if $B \ne 0$.

Proof In view of Theorem 4.2, these assertions follow immediately from
the analogous properties of sequences (Theorem 3.3).

Remark If $\mathbf{f}$ and $\mathbf{g}$ map $E$ into $R^k$, then (a) remains true,
and (b) becomes

(b') $\lim_{x \to p} (\mathbf{f} \cdot \mathbf{g})(x) = \mathbf{A} \cdot \mathbf{B}$.

(Compare Theorem 3.4.)


CONTINUOUS FUNCTIONS 

4.5 Definition Suppose $X$ and $Y$ are metric spaces, $E \subset X$, $p \in E$,
and $f$ maps $E$ into $Y$. Then $f$ is said to be continuous at $p$ if for every
$\varepsilon > 0$ there exists a $\delta > 0$ such that

$$d_Y(f(x), f(p)) < \varepsilon$$

for all points $x \in E$ for which $d_X(x, p) < \delta$.

If $f$ is continuous at every point of $E$, then $f$ is said to be continuous on $E$.

It should be noted that $f$ has to be defined at the point $p$ in order to be
continuous at $p$. (Compare this with the remark following Definition 4.1.)

If $p$ is an isolated point of $E$, then our definition implies that every
function $f$ which has $E$ as its domain of definition is continuous at $p$. For,
no matter which $\varepsilon > 0$ we choose, we can pick $\delta > 0$ so that the
only point $x \in E$ for which $d_X(x, p) < \delta$ is $x = p$; then

$$d_Y(f(x), f(p)) = 0 < \varepsilon.$$

4.6 Theorem In the situation given in Definition 4.5, assume also that $p$ is a
limit point of $E$. Then $f$ is continuous at $p$ if and only if
$\lim_{x \to p} f(x) = f(p)$.

Proof This is clear if we compare Definitions 4.1 and 4.5.

We now turn to compositions of functions. A brief statement of the 
following theorem is that a continuous function of a continuous function is 
continuous. 

4.7 Theorem Suppose $X$, $Y$, $Z$ are metric spaces, $E \subset X$, $f$ maps $E$
into $Y$, $g$ maps the range of $f$, $f(E)$, into $Z$, and $h$ is the mapping of
$E$ into $Z$ defined by

$$h(x) = g(f(x)) \qquad (x \in E).$$

If $f$ is continuous at a point $p \in E$ and if $g$ is continuous at the point
$f(p)$, then $h$ is continuous at $p$.

This function $h$ is called the composition or the composite of $f$ and $g$. The
notation

$$h = g \circ f$$

is frequently used in this context.

Proof Let $\varepsilon > 0$ be given. Since $g$ is continuous at $f(p)$, there
exists $\eta > 0$ such that

$$d_Z(g(y), g(f(p))) < \varepsilon \quad \text{if } d_Y(y, f(p)) < \eta \text{ and } y \in f(E).$$

Since $f$ is continuous at $p$, there exists $\delta > 0$ such that

$$d_Y(f(x), f(p)) < \eta \quad \text{if } d_X(x, p) < \delta \text{ and } x \in E.$$

It follows that

$$d_Z(h(x), h(p)) = d_Z(g(f(x)), g(f(p))) < \varepsilon$$

if $d_X(x, p) < \delta$ and $x \in E$. Thus $h$ is continuous at $p$.

4.8 Theorem A mapping $f$ of a metric space $X$ into a metric space $Y$ is
continuous on $X$ if and only if $f^{-1}(V)$ is open in $X$ for every open set $V$
in $Y$.

(Inverse images are defined in Definition 2.2.) This is a very useful
characterization of continuity.

Proof Suppose $f$ is continuous on $X$ and $V$ is an open set in $Y$. We have
to show that every point of $f^{-1}(V)$ is an interior point of $f^{-1}(V)$. So,
suppose $p \in X$ and $f(p) \in V$. Since $V$ is open, there exists
$\varepsilon > 0$ such that $y \in V$ if $d_Y(f(p), y) < \varepsilon$; and since
$f$ is continuous at $p$, there exists $\delta > 0$ such that
$d_Y(f(x), f(p)) < \varepsilon$ if $d_X(x, p) < \delta$. Thus $x \in f^{-1}(V)$ as
soon as $d_X(x, p) < \delta$.

Conversely, suppose $f^{-1}(V)$ is open in $X$ for every open set $V$ in $Y$.
Fix $p \in X$ and $\varepsilon > 0$, let $V$ be the set of all $y \in Y$ such that
$d_Y(y, f(p)) < \varepsilon$. Then $V$ is open; hence $f^{-1}(V)$ is open; hence
there exists $\delta > 0$ such that $x \in f^{-1}(V)$ as soon as
$d_X(p, x) < \delta$. But if $x \in f^{-1}(V)$, then $f(x) \in V$, so that
$d_Y(f(x), f(p)) < \varepsilon$.

This completes the proof.

Corollary A mapping $f$ of a metric space $X$ into a metric space $Y$ is continuous
if and only if $f^{-1}(C)$ is closed in $X$ for every closed set $C$ in $Y$.

This follows from the theorem, since a set is closed if and only if its
complement is open, and since $f^{-1}(E^c) = [f^{-1}(E)]^c$ for every $E \subset Y$.

We now turn to complex-valued and vector-valued functions, and to
functions defined on subsets of $R^k$.

4.9 Theorem Let $f$ and $g$ be complex continuous functions on a metric space $X$.
Then $f + g$, $fg$, and $f/g$ are continuous on $X$.

In the last case, we must of course assume that $g(x) \ne 0$, for all $x \in X$.

Proof At isolated points of $X$ there is nothing to prove. At limit points,
the statement follows from Theorems 4.4 and 4.6.

4.10 Theorem

(a) Let $f_1, \dots, f_k$ be real functions on a metric space $X$, and let
$\mathbf{f}$ be the mapping of $X$ into $R^k$ defined by

$$\mathbf{f}(x) = (f_1(x), \dots, f_k(x)) \qquad (x \in X); \tag{7}$$

then $\mathbf{f}$ is continuous if and only if each of the functions
$f_1, \dots, f_k$ is continuous.

(b) If $\mathbf{f}$ and $\mathbf{g}$ are continuous mappings of $X$ into $R^k$,
then $\mathbf{f} + \mathbf{g}$ and $\mathbf{f} \cdot \mathbf{g}$ are continuous
on $X$.

The functions $f_1, \dots, f_k$ are called the components of $\mathbf{f}$. Note
that $\mathbf{f} + \mathbf{g}$ is a mapping into $R^k$, whereas
$\mathbf{f} \cdot \mathbf{g}$ is a real function on $X$.

Proof Part (a) follows from the inequalities

$$|f_j(x) - f_j(y)| \le |\mathbf{f}(x) - \mathbf{f}(y)|
  = \left\{ \sum_{i=1}^{k} |f_i(x) - f_i(y)|^2 \right\}^{1/2},$$

for $j = 1, \dots, k$. Part (b) follows from (a) and Theorem 4.9.

4.11 Examples If $x_1, \dots, x_k$ are the coordinates of the point
$\mathbf{x} \in R^k$, the functions $\phi_i$ defined by

$$\phi_i(\mathbf{x}) = x_i \qquad (\mathbf{x} \in R^k) \tag{8}$$

are continuous on $R^k$, since the inequality

$$|\phi_i(\mathbf{x}) - \phi_i(\mathbf{y})| \le |\mathbf{x} - \mathbf{y}|$$

shows that we may take $\delta = \varepsilon$ in Definition 4.5. The functions
$\phi_i$ are sometimes called the coordinate functions.

Repeated application of Theorem 4.9 then shows that every monomial

$$x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} \tag{9}$$

where $n_1, \dots, n_k$ are nonnegative integers, is continuous on $R^k$. The same
is true of constant multiples of (9), since constants are evidently continuous. It
follows that every polynomial $P$, given by

$$P(\mathbf{x}) = \sum c_{n_1 \cdots n_k}\, x_1^{n_1} \cdots x_k^{n_k} \qquad (\mathbf{x} \in R^k), \tag{10}$$

is continuous on $R^k$. Here the coefficients $c_{n_1 \cdots n_k}$ are complex
numbers, $n_1, \dots, n_k$ are nonnegative integers, and the sum in (10) has
finitely many terms.

Furthermore, every rational function in $x_1, \dots, x_k$, that is, every quotient
of two polynomials of the form (10), is continuous on $R^k$ wherever the
denominator is different from zero.

From the triangle inequality one sees easily that

$$\big|\, |\mathbf{x}| - |\mathbf{y}| \,\big| \le |\mathbf{x} - \mathbf{y}| \qquad (\mathbf{x}, \mathbf{y} \in R^k). \tag{11}$$

Hence the mapping $\mathbf{x} \to |\mathbf{x}|$ is a continuous real function on
$R^k$.

If now $\mathbf{f}$ is a continuous mapping from a metric space $X$ into $R^k$,
and if $\phi$ is defined on $X$ by setting $\phi(p) = |\mathbf{f}(p)|$, it
follows, by Theorem 4.7, that $\phi$ is a continuous real function on $X$.

4.12 Remark We defined the notion of continuity for functions defined on a
subset $E$ of a metric space $X$. However, the complement of $E$ in $X$ plays no
role whatever in this definition (note that the situation was somewhat different
for limits of functions). Accordingly, we lose nothing of interest by discarding
the complement of the domain of $f$. This means that we may just as well talk
only about continuous mappings of one metric space into another, rather than
of mappings of subsets. This simplifies statements and proofs of some theorems.
We have already made use of this principle in Theorems 4.8 to 4.10, and will
continue to do so in the following section on compactness.


CONTINUITY AND COMPACTNESS 

4.13 Definition A mapping $\mathbf{f}$ of a set $E$ into $R^k$ is said to be
bounded if there is a real number $M$ such that $|\mathbf{f}(x)| \le M$ for all
$x \in E$.

4.14 Theorem Suppose $f$ is a continuous mapping of a compact metric space
$X$ into a metric space $Y$. Then $f(X)$ is compact.

Proof Let $\{V_\alpha\}$ be an open cover of $f(X)$. Since $f$ is continuous,
Theorem 4.8 shows that each of the sets $f^{-1}(V_\alpha)$ is open. Since $X$ is
compact, there are finitely many indices, say $\alpha_1, \dots, \alpha_n$, such
that

$$X \subset f^{-1}(V_{\alpha_1}) \cup \cdots \cup f^{-1}(V_{\alpha_n}). \tag{12}$$

Since $f(f^{-1}(E)) \subset E$ for every $E \subset Y$, (12) implies that

$$f(X) \subset V_{\alpha_1} \cup \cdots \cup V_{\alpha_n}. \tag{13}$$

This completes the proof.

Note: We have used the relation $f(f^{-1}(E)) \subset E$, valid for
$E \subset Y$. If $E \subset X$, then $f^{-1}(f(E)) \supset E$; equality need not
hold in either case.

We shall now deduce some consequences of Theorem 4.14. 

4.15 Theorem If $\mathbf{f}$ is a continuous mapping of a compact metric space $X$
into $R^k$, then $\mathbf{f}(X)$ is closed and bounded. Thus, $\mathbf{f}$ is
bounded.

This follows from Theorem 2.41. The result is particularly important
when $f$ is real:

4.16 Theorem Suppose $f$ is a continuous real function on a compact metric
space $X$, and

$$M = \sup_{p \in X} f(p), \qquad m = \inf_{p \in X} f(p). \tag{14}$$

Then there exist points $p, q \in X$ such that $f(p) = M$ and $f(q) = m$.

The notation in (14) means that $M$ is the least upper bound of the set of
all numbers $f(p)$, where $p$ ranges over $X$, and that $m$ is the greatest lower
bound of this set of numbers.

The conclusion may also be stated as follows: There exist points $p$ and $q$
in $X$ such that $f(q) \le f(x) \le f(p)$ for all $x \in X$; that is, $f$ attains
its maximum (at $p$) and its minimum (at $q$).

Proof By Theorem 4.15, $f(X)$ is a closed and bounded set of real numbers;
hence $f(X)$ contains

$$M = \sup f(X) \qquad \text{and} \qquad m = \inf f(X),$$

by Theorem 2.28.

4.17 Theorem Suppose $f$ is a continuous 1-1 mapping of a compact metric
space $X$ onto a metric space $Y$. Then the inverse mapping $f^{-1}$ defined on
$Y$ by

$$f^{-1}(f(x)) = x \qquad (x \in X)$$

is a continuous mapping of $Y$ onto $X$.

Proof Applying Theorem 4.8 to $f^{-1}$ in place of $f$, we see that it suffices
to prove that $f(V)$ is an open set in $Y$ for every open set $V$ in $X$. Fix such
a set $V$.

The complement $V^c$ of $V$ is closed in $X$, hence compact (Theorem
2.35); hence $f(V^c)$ is a compact subset of $Y$ (Theorem 4.14) and so is
closed in $Y$ (Theorem 2.34). Since $f$ is one-to-one and onto, $f(V)$ is the
complement of $f(V^c)$. Hence $f(V)$ is open.

4.18 Definition Let $f$ be a mapping of a metric space $X$ into a metric space $Y$.
We say that $f$ is uniformly continuous on $X$ if for every $\varepsilon > 0$ there
exists $\delta > 0$ such that

$$d_Y(f(p), f(q)) < \varepsilon \tag{15}$$

for all $p$ and $q$ in $X$ for which $d_X(p, q) < \delta$.

Let us consider the differences between the concepts of continuity and of
uniform continuity. First, uniform continuity is a property of a function on a
set, whereas continuity can be defined at a single point. To ask whether a given
function is uniformly continuous at a certain point is meaningless. Second, if
$f$ is continuous on $X$, then it is possible to find, for each $\varepsilon > 0$
and for each point $p$ of $X$, a number $\delta > 0$ having the property specified
in Definition 4.5. This $\delta$ depends on $\varepsilon$ and on $p$. If $f$ is,
however, uniformly continuous on $X$, then it is possible, for each
$\varepsilon > 0$, to find one number $\delta > 0$ which will do for all points
$p$ of $X$.

Evidently, every uniformly continuous function is continuous. That the
two concepts are equivalent on compact sets follows from the next theorem.

4.19 Theorem Let $f$ be a continuous mapping of a compact metric space $X$
into a metric space $Y$. Then $f$ is uniformly continuous on $X$.

Proof Let $\varepsilon > 0$ be given. Since $f$ is continuous, we can associate to
each point $p \in X$ a positive number $\phi(p)$ such that

$$q \in X,\ d_X(p, q) < \phi(p) \quad \text{implies} \quad d_Y(f(p), f(q)) < \frac{\varepsilon}{2}. \tag{16}$$

Let $J(p)$ be the set of all $q \in X$ for which

$$d_X(p, q) < \tfrac{1}{2} \phi(p). \tag{17}$$

Since $p \in J(p)$, the collection of all sets $J(p)$ is an open cover of $X$; and
since $X$ is compact, there is a finite set of points $p_1, \dots, p_n$ in $X$,
such that

$$X \subset J(p_1) \cup \cdots \cup J(p_n). \tag{18}$$

We put

$$\delta = \tfrac{1}{2} \min\,[\phi(p_1), \dots, \phi(p_n)]. \tag{19}$$

Then $\delta > 0$. (This is one point where the finiteness of the covering,
inherent in the definition of compactness, is essential. The minimum of a
finite set of positive numbers is positive, whereas the inf of an infinite set
of positive numbers may very well be 0.)

Now let $q$ and $p$ be points of $X$, such that $d_X(p, q) < \delta$. By (18),
there is an integer $m$, $1 \le m \le n$, such that $p \in J(p_m)$; hence

$$d_X(p, p_m) < \tfrac{1}{2} \phi(p_m), \tag{20}$$

and we also have

$$d_X(q, p_m) \le d_X(p, q) + d_X(p, p_m) < \delta + \tfrac{1}{2} \phi(p_m) \le \phi(p_m).$$

Finally, (16) shows that therefore

$$d_Y(f(p), f(q)) \le d_Y(f(p), f(p_m)) + d_Y(f(q), f(p_m)) < \varepsilon.$$

This completes the proof.

An alternative proof is sketched in Exercise 10.

We now proceed to show that compactness is essential in the hypotheses 
of Theorems 4.14, 4.15, 4.16, and 4.19. 

4.20 Theorem Let $E$ be a noncompact set in $R^1$. Then

(a) there exists a continuous function on $E$ which is not bounded;

(b) there exists a continuous and bounded function on $E$ which has no
maximum.

If, in addition, $E$ is bounded, then

(c) there exists a continuous function on $E$ which is not uniformly
continuous.

Proof Suppose first that $E$ is bounded, so that there exists a limit point
$x_0$ of $E$ which is not a point of $E$. Consider

$$f(x) = \frac{1}{x - x_0} \qquad (x \in E). \tag{21}$$

This is continuous on $E$ (Theorem 4.9), but evidently unbounded. To see
that (21) is not uniformly continuous, let $\varepsilon > 0$ and $\delta > 0$ be
arbitrary, and choose a point $x \in E$ such that $|x - x_0| < \delta$. Taking $t$
close enough to $x_0$, we can then make the difference $|f(t) - f(x)|$ greater
than $\varepsilon$, although $|t - x| < \delta$. Since this is true for every
$\delta > 0$, $f$ is not uniformly continuous on $E$.

The function $g$ given by

$$g(x) = \frac{1}{1 + (x - x_0)^2} \qquad (x \in E) \tag{22}$$

is continuous on $E$, and is bounded, since $0 < g(x) < 1$. It is clear that

$$\sup_{x \in E} g(x) = 1,$$

whereas $g(x) < 1$ for all $x \in E$. Thus $g$ has no maximum on $E$.

Having proved the theorem for bounded sets $E$, let us now suppose
that $E$ is unbounded. Then $f(x) = x$ establishes (a), whereas

$$h(x) = \frac{x^2}{1 + x^2} \qquad (x \in E) \tag{23}$$

establishes (b), since

$$\sup_{x \in E} h(x) = 1$$

and $h(x) < 1$ for all $x \in E$.

Assertion (c) would be false if boundedness were omitted from the
hypotheses. For, let $E$ be the set of all integers. Then every function
defined on $E$ is uniformly continuous on $E$. To see this, we need merely
take $\delta < 1$ in Definition 4.18.
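
The failure of uniform continuity in the first part of this proof can be made
concrete. The following sketch assumes the special case $E = (0, 1)$ and
$x_0 = 0$, so that (21) becomes $f(x) = 1/x$; the function `witness` and its
particular choice of $t$ are assumptions of this example. For any given
$\varepsilon$ and $\delta$ it returns points $x, t \in E$ with $|x - t| < \delta$
and $|f(t) - f(x)| > \varepsilon$.

```python
def f(x):
    """The function (21) with x0 = 0 on E = (0, 1)."""
    return 1.0 / x


def witness(eps, delta):
    """Return x, t in (0, 1) with |x - t| < delta but |f(t) - f(x)| > eps."""
    x = min(delta, 1.0) / 2.0     # a point of E within delta of x0 = 0
    t = x / (2.0 + eps * x)       # chosen so that 1/t - 1/x = 1/x + eps
    return x, t


for eps, delta in [(1.0, 0.1), (100.0, 1e-3), (0.5, 5.0)]:
    x, t = witness(eps, delta)
    print(eps, delta, abs(x - t), abs(f(t) - f(x)))
```

Since a witness exists for every $\delta > 0$, no single $\delta$ can serve all
points of $E$ for a given $\varepsilon$.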

We conclude this section by showing that compactness is also essential in 
Theorem 4.17. 

4.21 Example Let X be the half-open interval $[0, 2\pi)$ on the real line, and
let f be the mapping of X onto the circle Y consisting of all points whose distance
from the origin is 1, given by

(24)    $f(t) = (\cos t, \sin t) \qquad (0 \le t < 2\pi).$

The continuity of the trigonometric functions cosine and sine, as well as their
periodicity properties, will be established in Chap. 8. These results show that
f is a continuous 1-1 mapping of X onto Y.

However, the inverse mapping (which exists, since f is one-to-one and
onto) fails to be continuous at the point (1, 0) = f(0). Of course, X is not
compact in this example. (It may be of interest to observe that $f^{-1}$ fails to be
continuous in spite of the fact that Y is compact!)
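
The failure of continuity of the inverse at (1, 0) can be observed numerically. The following sketch takes the library functions `math.cos`, `math.sin` and `math.atan2` as stand-ins for the functions constructed in Chap. 8; the name `f_inverse` is ours. Points of Y just below (1, 0) are close to f(0), yet their preimages lie near $2\pi$, not near 0.

```python
import math


def f(t):
    """The map (24): t in [0, 2*pi) -> point on the unit circle."""
    return (math.cos(t), math.sin(t))


def f_inverse(point):
    """Inverse of f: the angle in [0, 2*pi) of a point on the unit circle."""
    x, y = point
    angle = math.atan2(y, x)          # atan2 returns a value in [-pi, pi]
    return angle if angle >= 0 else angle + 2 * math.pi


p0 = f(0.0)                            # the point (1, 0)
for s in (1e-1, 1e-3, 1e-6):
    p = f(2 * math.pi - s)             # a point of Y within distance s of p0
    gap_in_Y = math.hypot(p[0] - p0[0], p[1] - p0[1])
    gap_in_X = abs(f_inverse(p) - f_inverse(p0))
    print(f"distance in Y: {gap_in_Y:.2e}   distance in X: {gap_in_X:.6f}")
```

The distances in Y shrink to 0 while the distances in X stay close to $2\pi$.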


CONTINUITY AND CONNECTEDNESS 

4.22 Theorem If f is a continuous mapping of a metric space X into a metric
space Y, and if E is a connected subset of X, then f(E) is connected.

Proof Assume, on the contrary, that $f(E) = A \cup B$, where A and B are
nonempty separated subsets of Y. Put $G = E \cap f^{-1}(A)$, $H = E \cap f^{-1}(B)$.
Then $E = G \cup H$, and neither G nor H is empty.

Since $A \subset \bar{A}$ (the closure of A), we have $G \subset f^{-1}(\bar{A})$; the latter set is
closed, since f is continuous; hence $\bar{G} \subset f^{-1}(\bar{A})$. It follows that $f(\bar{G}) \subset \bar{A}$.
Since f(H) = B and $\bar{A} \cap B$ is empty, we conclude that $\bar{G} \cap H$ is empty.

The same argument shows that $G \cap \bar{H}$ is empty. Thus G and H are
separated. This is impossible if E is connected.

4.23 Theorem Let f be a continuous real function on the interval [a, b]. If
f(a) < f(b) and if c is a number such that f(a) < c < f(b), then there exists a
point $x \in (a, b)$ such that f(x) = c.

A similar result holds, of course, if f(a) > f(b). Roughly speaking, the
theorem says that a continuous real function assumes all intermediate values on
an interval.

Proof By Theorem 2.47, [a, b] is connected; hence Theorem 4.22 shows
that f([a, b]) is a connected subset of $R^1$, and the assertion follows if we
appeal once more to Theorem 2.47.
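
Theorem 4.23 only asserts that the point x exists. A different, constructive route (bisection, which is not the connectedness argument used above) locates such a point approximately. The sketch below uses the hypothetical name `intermediate_point`, assumes that f is continuous on [a, b] with f(a) < c < f(b), and works in floating-point arithmetic, so the result is an approximation only.

```python
def intermediate_point(f, a, b, c, tol=1e-12):
    """Approximate x in (a, b) with f(x) = c.

    Assumes f is continuous on [a, b] and f(a) < c < f(b).
    Bisection keeps the invariant f(lo) < c <= f(hi).
    """
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid          # the value c is still ahead of us
        else:
            hi = mid          # c has been reached or passed
    return (lo + hi) / 2


def cubic(x):
    """A sample continuous function with cubic(0) = 0 < 5 < 10 = cubic(2)."""
    return x ** 3 + x


x = intermediate_point(cubic, 0.0, 2.0, 5.0)
print(x, cubic(x))
```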

4.24 Remark At first glance, it might seem that Theorem 4.23 has a converse.
That is, one might think that if for any two points $x_1 < x_2$ and for any number c
between $f(x_1)$ and $f(x_2)$ there is a point x in $(x_1, x_2)$ such that f(x) = c, then f
must be continuous.

That this is not so may be concluded from Example 4.27(d).


DISCONTINUITIES

If x is a point in the domain of definition of the function f at which f is not
continuous, we say that f is discontinuous at x, or that f has a discontinuity at x.
If f is defined on an interval or on a segment, it is customary to divide
discontinuities into two types. Before giving this classification, we have to define the
right-hand and the left-hand limits of f at x, which we denote by f(x+) and f(x-),
respectively.

4.25 Definition Let f be defined on (a, b). Consider any point x such that
$a \le x < b$. We write

        $f(x+) = q$

if $f(t_n) \to q$ as $n \to \infty$, for all sequences $\{t_n\}$ in (x, b) such that $t_n \to x$. To obtain
the definition of f(x-), for $a < x \le b$, we restrict ourselves to sequences $\{t_n\}$ in
(a, x).

It is clear that at any point x of (a, b), $\lim_{t \to x} f(t)$ exists if and only if

        $f(x+) = f(x-) = \lim_{t \to x} f(t).$


4.26 Definition Let f be defined on (a, b). If f is discontinuous at a point x,
and if f(x+) and f(x-) exist, then f is said to have a discontinuity of the first
kind, or a simple discontinuity, at x. Otherwise the discontinuity is said to be of
the second kind.

There are two ways in which a function can have a simple discontinuity:
either $f(x+) \ne f(x-)$ [in which case the value f(x) is immaterial], or
$f(x+) = f(x-) \ne f(x)$.

4.27 Examples 

(a) Define

        $f(x) = \begin{cases} 1 & (x \text{ rational}), \\ 0 & (x \text{ irrational}). \end{cases}$

Then f has a discontinuity of the second kind at every point x, since
neither f(x+) nor f(x-) exists.

(b) Define

        $f(x) = \begin{cases} x & (x \text{ rational}), \\ 0 & (x \text{ irrational}). \end{cases}$

Then f is continuous at x = 0 and has a discontinuity of the second
kind at every other point.

(c) Define

        $f(x) = \begin{cases} x + 2 & (-3 < x < -2), \\ -x - 2 & (-2 \le x < 0), \\ x + 2 & (0 \le x < 1). \end{cases}$

Then f has a simple discontinuity at x = 0 and is continuous at
every other point of (-3, 1).

(d) Define

        $f(x) = \begin{cases} \sin \frac{1}{x} & (x \ne 0), \\ 0 & (x = 0). \end{cases}$

Since neither f(0+) nor f(0-) exists, f has a discontinuity of the
second kind at x = 0. We have not yet shown that sin x is a continuous
function. If we assume this result for the moment, Theorem 4.7 implies
that f is continuous at every point $x \ne 0$.
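
To see concretely why f(0+) fails to exist in (d), one can follow f along two sequences in (0, 1) that both tend to 0. The sketch below is an illustration only; it takes `math.sin` as a stand-in for the sine function of Chap. 8, and the sequence names `zeros` and `peaks` are ours. Along the first sequence f is 0 (up to rounding), along the second it is 1, so no single number q can serve in Definition 4.25.

```python
import math


def f(x):
    """Example 4.27(d): sin(1/x) for x != 0, and 0 at x = 0."""
    return math.sin(1 / x) if x != 0 else 0.0


# Two sequences in (0, 1), both tending to 0 from the right.
zeros = [1 / (n * math.pi) for n in range(1, 200)]                    # sin(n*pi) = 0
peaks = [1 / (2 * n * math.pi + math.pi / 2) for n in range(1, 200)]  # sin = 1

values_on_zeros = [f(t) for t in zeros]
values_on_peaks = [f(t) for t in peaks]

print(max(abs(v) for v in values_on_zeros))       # tiny: rounding error only
print(min(values_on_peaks))                       # very close to 1
```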


MONOTONIC FUNCTIONS 

We shall now study those functions which never decrease (or never increase) on 
a given segment. 

4.28 Definition Let f be real on (a, b). Then f is said to be monotonically
increasing on (a, b) if a < x < y < b implies $f(x) \le f(y)$. If the last inequality
is reversed, we obtain the definition of a monotonically decreasing function. The
class of monotonic functions consists of both the increasing and the decreasing
functions.

4.29 Theorem Let f be monotonically increasing on (a, b). Then f(x+) and
f(x-) exist at every point x of (a, b). More precisely,

(25)    $\sup_{a<t<x} f(t) = f(x-) \le f(x) \le f(x+) = \inf_{x<t<b} f(t).$

Furthermore, if a < x < y < b, then

(26)    $f(x+) \le f(y-).$

Analogous results evidently hold for monotonically decreasing functions.

Proof By hypothesis, the set of numbers f(t), where a < t < x, is bounded
above by the number f(x), and therefore has a least upper bound which
we shall denote by A. Evidently $A \le f(x)$. We have to show that
A = f(x-).

Let $\varepsilon > 0$ be given. It follows from the definition of A as a least
upper bound that there exists $\delta > 0$ such that $a < x - \delta < x$ and

(27)    $A - \varepsilon < f(x - \delta) \le A.$

Since f is monotonic, we have

(28)    $f(x - \delta) \le f(t) \le A \qquad (x - \delta < t < x).$

Combining (27) and (28), we see that

        $|f(t) - A| < \varepsilon \qquad (x - \delta < t < x).$

Hence f(x-) = A.

The second half of (25) is proved in precisely the same way.

Next, if a < x < y < b, we see from (25) that

(29)    $f(x+) = \inf_{x<t<b} f(t) = \inf_{x<t<y} f(t).$

The last equality is obtained by applying (25) to (a, y) in place of (a, b).
Similarly,

(30)    $f(y-) = \sup_{a<t<y} f(t) = \sup_{x<t<y} f(t).$

Comparison of (29) and (30) gives (26).

Corollary Monotonic functions have no discontinuities of the second kind.

This corollary implies that every monotonic function is discontinuous at
a countable set of points at most. Instead of appealing to the general theorem
whose proof is sketched in Exercise 17, we give here a simple proof which is
applicable to monotonic functions.

4.30 Theorem Let f be monotonic on (a, b). Then the set of points of (a, b) at
which f is discontinuous is at most countable.

Proof Suppose, for the sake of definiteness, that f is increasing, and
let E be the set of points at which f is discontinuous.

With every point x of E we associate a rational number r(x) such
that

        $f(x-) < r(x) < f(x+).$

Since $x_1 < x_2$ implies $f(x_1+) \le f(x_2-)$, we see that $r(x_1) \ne r(x_2)$ if
$x_1 \ne x_2$.

We have thus established a 1-1 correspondence between the set E and
a subset of the set of rational numbers. The latter, as we know, is countable.

4.31 Remark It should be noted that the discontinuities of a monotonic
function need not be isolated. In fact, given any countable subset E of (a, b),
which may even be dense, we can construct a function f, monotonic on (a, b),
discontinuous at every point of E, and at no other point of (a, b).

To show this, let the points of E be arranged in a sequence $\{x_n\}$,
n = 1, 2, 3, .... Let $\{c_n\}$ be a sequence of positive numbers such that $\sum c_n$
converges. Define

(31)    $f(x) = \sum_{x_n < x} c_n \qquad (a < x < b).$

The summation is to be understood as follows: Sum over those indices n
for which $x_n < x$. If there are no points $x_n$ to the left of x, the sum is empty;
following the usual convention, we define it to be zero. Since (31) converges
absolutely, the order in which the terms are arranged is immaterial.

We leave the verification of the following properties of f to the reader:

(a) f is monotonically increasing on (a, b);

(b) f is discontinuous at every point of E; in fact,

        $f(x_n+) - f(x_n-) = c_n.$

(c) f is continuous at every other point of (a, b).

Moreover, it is not hard to see that f(x-) = f(x) at all points of (a, b). If
a function satisfies this condition, we say that f is continuous from the left. If
the summation in (31) were taken over all indices n for which $x_n \le x$, we would
have f(x+) = f(x) at every point of (a, b); that is, f would be continuous from
the right.

Functions of this sort can also be defined by another method; for an 
example we refer to Theorem 6.16. 
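
The construction (31) can be tried on a small scale. The sketch below is a finite illustration only: the lists `points` and `weights` are hypothetical stand-ins for $\{x_n\}$ and $\{c_n\}$ (with $c_n = 2^{-n}$), and the helper `jump` estimates f(x+) - f(x-) by evaluating f at a distance h on either side, which is exact here because h is smaller than every gap between the finitely many points used.

```python
from fractions import Fraction

# Hypothetical finite stand-ins for the sequence {x_n} in (0, 1) and the weights {c_n}.
points = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(1, 4)]
weights = [Fraction(1, 2 ** n) for n in range(1, len(points) + 1)]


def f(x):
    """Formula (31): sum of c_n over those n with x_n < x (an empty sum is 0)."""
    return sum((c for p, c in zip(points, weights) if p < x), Fraction(0))


def jump(x, h=Fraction(1, 1000)):
    """f just to the right of x minus f just to the left of x.

    h must be smaller than the gaps between the finitely many points above.
    """
    return f(x + h) - f(x - h)


for p, c in zip(points, weights):
    print(p, jump(p), c)      # the jump at x_n equals c_n
```

In this finite model f(x - h) = f(x) at each of the chosen points, in line with the remark that (31) gives a function continuous from the left.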


INFINITE LIMITS AND LIMITS AT INFINITY 

To enable us to operate in the extended real number system, we shall now
enlarge the scope of Definition 4.1, by reformulating it in terms of neighborhoods.

For any real number x, we have already defined a neighborhood of x to
be any segment $(x - \delta, x + \delta)$.

4.32 Definition For any real c, the set of real numbers x such that x > c is
called a neighborhood of $+\infty$ and is written $(c, +\infty)$. Similarly, the set $(-\infty, c)$
is a neighborhood of $-\infty$.

4.33 Definition Let f be a real function defined on E. We say that

        $f(t) \to A$ as $t \to x$,

where A and x are in the extended real number system, if for every neighborhood
U of A there is a neighborhood V of x such that $V \cap E$ is not empty, and such
that $f(t) \in U$ for all $t \in V \cap E$, $t \ne x$.

A moment’s consideration will show that this coincides with Definition 
4.1 when A and x are real. 

The analogue of Theorem 4.4 is still true, and the proof offers nothing 
new. We state it, for the sake of completeness. 

4.34 Theorem Let f and g be defined on E. Suppose

        $f(t) \to A$, $g(t) \to B$ as $t \to x$.

Then

(a) $f(t) \to A'$ implies $A' = A$.

(b) $(f + g)(t) \to A + B$,

(c) $(fg)(t) \to AB$,

(d) $(f/g)(t) \to A/B$,

provided the right members of (b), (c), and (d) are defined.

Note that $\infty - \infty$, $0 \cdot \infty$, $\infty/\infty$, $A/0$ are not defined (see Definition 1.23).


EXERCISES 

1. Suppose f is a real function defined on $R^1$ which satisfies

        $\lim_{h \to 0} [f(x + h) - f(x - h)] = 0$

for every $x \in R^1$. Does this imply that f is continuous?

2. If f is a continuous mapping of a metric space X into a metric space Y, prove that

        $f(\bar{E}) \subset \overline{f(E)}$

for every set $E \subset X$. ($\bar{E}$ denotes the closure of E.) Show, by an example, that
$f(\bar{E})$ can be a proper subset of $\overline{f(E)}$.

3. Let f be a continuous real function on a metric space X. Let Z(f) (the zero set of f)
be the set of all $p \in X$ at which f(p) = 0. Prove that Z(f) is closed.

4. Let f and g be continuous mappings of a metric space X into a metric space Y,
and let E be a dense subset of X. Prove that f(E) is dense in f(X). If g(p) = f(p)
for all $p \in E$, prove that g(p) = f(p) for all $p \in X$. (In other words, a continuous
mapping is determined by its values on a dense subset of its domain.)

5. If f is a real continuous function defined on a closed set $E \subset R^1$, prove that there
exist continuous real functions g on $R^1$ such that g(x) = f(x) for all $x \in E$. (Such
functions g are called continuous extensions of f from E to $R^1$.) Show that the
result becomes false if the word “closed” is omitted. Extend the result to
vector-valued functions. Hint: Let the graph of g be a straight line on each of the
segments which constitute the complement of E (compare Exercise 29, Chap. 2).
The result remains true if $R^1$ is replaced by any metric space, but the proof is not
so simple.

6. If f is defined on E, the graph of f is the set of points (x, f(x)), for $x \in E$. In
particular, if E is a set of real numbers, and f is real-valued, the graph of f is a subset of
the plane.

Suppose E is compact, and prove that f is continuous on E if and only if
its graph is compact.

7. If $E \subset X$ and if f is a function defined on X, the restriction of f to E is the function
g whose domain of definition is E, such that g(p) = f(p) for $p \in E$. Define f and g
on $R^2$ by: f(0, 0) = g(0, 0) = 0, $f(x, y) = xy^2/(x^2 + y^4)$, $g(x, y) = xy^2/(x^2 + y^6)$
if $(x, y) \ne (0, 0)$. Prove that f is bounded on $R^2$, that g is unbounded in every
neighborhood of (0, 0), and that f is not continuous at (0, 0); nevertheless, the
restrictions of both f and g to every straight line in $R^2$ are continuous!

8. Let f be a real uniformly continuous function on the bounded set E in $R^1$. Prove
that f is bounded on E.

Show that the conclusion is false if boundedness of E is omitted from the
hypothesis.

9. Show that the requirement in the definition of uniform continuity can be rephrased
as follows, in terms of diameters of sets: To every $\varepsilon > 0$ there exists a $\delta > 0$ such
that diam $f(E) < \varepsilon$ for all $E \subset X$ with diam $E < \delta$.

10. Complete the details of the following alternative proof of Theorem 4.19: If f is not
uniformly continuous, then for some $\varepsilon > 0$ there are sequences $\{p_n\}$, $\{q_n\}$ in X such
that $d_X(p_n, q_n) \to 0$ but $d_Y(f(p_n), f(q_n)) > \varepsilon$. Use Theorem 2.37 to obtain a
contradiction.

11. Suppose f is a uniformly continuous mapping of a metric space X into a metric
space Y and prove that $\{f(x_n)\}$ is a Cauchy sequence in Y for every Cauchy
sequence $\{x_n\}$ in X. Use this result to give an alternative proof of the theorem stated
in Exercise 13.

12. A uniformly continuous function of a uniformly continuous function is uniformly
continuous.

State this more precisely and prove it.

13. Let E be a dense subset of a metric space X, and let f be a uniformly continuous
real function defined on E. Prove that f has a continuous extension from E to X
(see Exercise 5 for terminology). (Uniqueness follows from Exercise 4.) Hint: For
each $p \in X$ and each positive integer n, let $V_n(p)$ be the set of all $q \in E$ with
d(p, q) < 1/n. Use Exercise 9 to show that the intersection of the closures of the
sets $f(V_1(p))$, $f(V_2(p))$, ..., consists of a single point, say g(p), of $R^1$. Prove that
the function g so defined on X is the desired extension of f.

Could the range space $R^1$ be replaced by $R^k$? By any compact metric space?
By any complete metric space? By any metric space?

14. Let I = [0, 1] be the closed unit interval. Suppose f is a continuous mapping of I
into I. Prove that f(x) = x for at least one $x \in I$.

15. Call a mapping of X into Y open if f(V) is an open set in Y whenever V is an open
set in X.

Prove that every continuous open mapping of $R^1$ into $R^1$ is monotonic.

16. Let [x] denote the largest integer contained in x, that is, [x] is the integer such
that $x - 1 < [x] \le x$; and let (x) = x - [x] denote the fractional part of x. What
discontinuities do the functions [x] and (x) have?

17. Let f be a real function defined on (a, b). Prove that the set of points at which f
has a simple discontinuity is at most countable. Hint: Let E be the set on which
f(x-) < f(x+). With each point x of E, associate a triple (p, q, r) of rational
numbers such that

(a) f(x-) < p < f(x+),

(b) a < q < t < x implies f(t) < p,

(c) x < t < r < b implies f(t) > p.

The set of all such triples is countable. Show that each triple is associated with at
most one point of E. Deal similarly with the other possible types of simple
discontinuities.

18. Every rational x can be written in the form x = m/n, where n > 0, and m and n are
integers without any common divisors. When x = 0, we take n = 1. Consider the
function f defined on $R^1$ by

        $f(x) = \begin{cases} 0 & (x \text{ irrational}), \\ \frac{1}{n} & \left(x = \frac{m}{n}\right). \end{cases}$

Prove that f is continuous at every irrational point, and that f has a simple
discontinuity at every rational point.

19. Suppose f is a real function with domain $R^1$ which has the intermediate value
property: If f(a) < c < f(b), then f(x) = c for some x between a and b.

Suppose also, for every rational r, that the set of all x with f(x) = r is closed.

Prove that f is continuous.

Hint: If $x_n \to x_0$ but $f(x_n) > r > f(x_0)$ for some r and all n, then $f(t_n) = r$
for some $t_n$ between $x_0$ and $x_n$; thus $t_n \to x_0$. Find a contradiction. (N. J. Fine,
Amer. Math. Monthly, vol. 73, 1966, p. 782.)

20. If E is a nonempty subset of a metric space X, define the distance from $x \in X$ to E
by

        $\rho_E(x) = \inf_{z \in E} d(x, z).$

(a) Prove that $\rho_E(x) = 0$ if and only if $x \in \bar{E}$.

(b) Prove that $\rho_E$ is a uniformly continuous function on X, by showing that

        $|\rho_E(x) - \rho_E(y)| \le d(x, y)$

for all $x \in X$, $y \in X$.

Hint: $\rho_E(x) \le d(x, z) \le d(x, y) + d(y, z)$, so that

        $\rho_E(x) \le d(x, y) + \rho_E(y).$

21. Suppose K and F are disjoint sets in a metric space X, K is compact, F is closed.
Prove that there exists $\delta > 0$ such that $d(p, q) > \delta$ if $p \in K$, $q \in F$. Hint: $\rho_F$ is a
continuous positive function on K.

Show that the conclusion may fail for two disjoint closed sets if neither is
compact.

22. Let A and B be disjoint nonempty closed sets in a metric space X, and define

        $f(p) = \frac{\rho_A(p)}{\rho_A(p) + \rho_B(p)} \qquad (p \in X).$

Show that f is a continuous function on X whose range lies in [0, 1], that f(p) = 0
precisely on A and f(p) = 1 precisely on B. This establishes a converse of Exercise
3: Every closed set $A \subset X$ is Z(f) for some continuous real f on X. Setting

        $V = f^{-1}([0, \tfrac{1}{2})), \qquad W = f^{-1}((\tfrac{1}{2}, 1]),$

show that V and W are open and disjoint, and that $A \subset V$, $B \subset W$. (Thus pairs of
disjoint closed sets in a metric space can be covered by pairs of disjoint open sets.
This property of metric spaces is called normality.)

23. A real-valued function f defined in (a, b) is said to be convex if

        $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$

whenever a < x < b, a < y < b, $0 < \lambda < 1$. Prove that every convex function is
continuous. Prove that every increasing convex function of a convex function is
convex. (For example, if f is convex, so is $e^f$.)

If f is convex in (a, b) and if a < s < t < u < b, show that

        $\frac{f(t) - f(s)}{t - s} \le \frac{f(u) - f(s)}{u - s} \le \frac{f(u) - f(t)}{u - t}.$

24. Assume that f is a continuous real function defined in (a, b) such that

        $f\left(\frac{x + y}{2}\right) \le \frac{f(x) + f(y)}{2}$

for all $x, y \in (a, b)$. Prove that f is convex.

25. If $A \subset R^k$ and $B \subset R^k$, define A + B to be the set of all sums x + y with $x \in A$,
$y \in B$.

(a) If K is compact and C is closed in $R^k$, prove that K + C is closed.

Hint: Take $z \notin K + C$, put F = z - C, the set of all z - y with $y \in C$. Then
K and F are disjoint. Choose $\delta$ as in Exercise 21. Show that the open ball with
center z and radius $\delta$ does not intersect K + C.

(b) Let $\alpha$ be an irrational real number. Let $C_1$ be the set of all integers, let $C_2$ be
the set of all $n\alpha$ with $n \in C_1$. Show that $C_1$ and $C_2$ are closed subsets of $R^1$ whose
sum $C_1 + C_2$ is not closed, by showing that $C_1 + C_2$ is a countable dense subset
of $R^1$.

26. Suppose X, Y, Z are metric spaces, and Y is compact. Let f map X into Y, let
g be a continuous one-to-one mapping of Y into Z, and put h(x) = g(f(x)) for
$x \in X$.

Prove that f is uniformly continuous if h is uniformly continuous.

Hint: $g^{-1}$ has compact domain g(Y), and $f(x) = g^{-1}(h(x))$.

Prove also that f is continuous if h is continuous.

Show (by modifying Example 4.21, or by finding a different example) that
the compactness of Y cannot be omitted from the hypotheses, even when X and
Z are compact.



5 

DIFFERENTIATION 


In this chapter we shall (except in the final section) confine our attention to real
functions defined on intervals or segments. This is not just a matter of
convenience, since genuine differences appear when we pass from real functions to
vector-valued ones. Differentiation of functions defined on $R^k$ will be discussed
in Chap. 9.


THE DERIVATIVE OF A REAL FUNCTION 

5.1 Definition Let f be defined (and real-valued) on [a, b]. For any $x \in [a, b]$
form the quotient

(1)     $\phi(t) = \frac{f(t) - f(x)}{t - x} \qquad (a < t < b,\ t \ne x),$

and define

(2)     $f'(x) = \lim_{t \to x} \phi(t),$

provided this limit exists in accordance with Definition 4.1.

We thus associate with the function f a function f' whose domain
is the set of points x at which the limit (2) exists; f' is called the derivative
of f.

If f' is defined at a point x, we say that f is differentiable at x. If f' is
defined at every point of a set $E \subset [a, b]$, we say that f is differentiable on E.

It is possible to consider right-hand and left-hand limits in (2); this leads
to the definition of right-hand and left-hand derivatives. In particular, at the
endpoints a and b, the derivative, if it exists, is a right-hand or left-hand
derivative, respectively. We shall not, however, discuss one-sided derivatives in any
detail.

If f is defined on a segment (a, b) and if a < x < b, then f'(x) is defined
by (1) and (2), as above. But f'(a) and f'(b) are not defined in this case.


5.2 Theorem Let f be defined on [a, b]. If f is differentiable at a point $x \in [a, b]$,
then f is continuous at x.

Proof As $t \to x$, we have, by Theorem 4.4,

        $f(t) - f(x) = \frac{f(t) - f(x)}{t - x} \cdot (t - x) \to f'(x) \cdot 0 = 0.$

The converse of this theorem is not true. It is easy to construct continuous 
functions which fail to be differentiable at isolated points. In Chap. 7 we shall 
even become acquainted with a function which is continuous on the whole line 
without being differentiable at any point! 


5.3 Theorem Suppose f and g are defined on [a, b] and are differentiable at a
point $x \in [a, b]$. Then f + g, fg, and f/g are differentiable at x, and

(a) $(f + g)'(x) = f'(x) + g'(x)$;

(b) $(fg)'(x) = f'(x)g(x) + f(x)g'(x)$;

(c) $\left(\frac{f}{g}\right)'(x) = \frac{g(x)f'(x) - g'(x)f(x)}{g^2(x)}$.

In (c), we assume of course that $g(x) \ne 0$.


Proof (a) is clear, by Theorem 4.4. Let h = fg. Then

        $h(t) - h(x) = f(t)[g(t) - g(x)] + g(x)[f(t) - f(x)].$

If we divide this by t - x and note that $f(t) \to f(x)$ as $t \to x$ (Theorem 5.2),
(b) follows. Next, let h = f/g. Then

        $\frac{h(t) - h(x)}{t - x} = \frac{1}{g(t)g(x)} \left[ g(x) \frac{f(t) - f(x)}{t - x} - f(x) \frac{g(t) - g(x)}{t - x} \right].$

Letting $t \to x$, and applying Theorems 4.4 and 5.2, we obtain (c).


5.4 Examples The derivative of any constant is clearly zero. If f is defined
by f(x) = x, then f'(x) = 1. Repeated application of (b) and (c) then shows that
$x^n$ is differentiable, and that its derivative is $nx^{n-1}$, for any integer n (if n < 0,
we have to restrict ourselves to $x \ne 0$). Thus every polynomial is differentiable,
and so is every rational function, except at the points where the denominator is
zero.

The following theorem is known as the “chain rule” for differentiation. 
It deals with differentiation of composite functions and is probably the most 
important theorem about derivatives. We shall meet more general versions of it 
in Chap. 9. 


5.5 Theorem Suppose f is continuous on [a, b], f'(x) exists at some point
$x \in [a, b]$, g is defined on an interval I which contains the range of f, and g is
differentiable at the point f(x). If

        $h(t) = g(f(t)) \qquad (a \le t \le b),$

then h is differentiable at x, and

(3)     $h'(x) = g'(f(x)) f'(x).$

Proof Let y = f(x). By the definition of the derivative, we have

(4)     $f(t) - f(x) = (t - x)[f'(x) + u(t)],$

(5)     $g(s) - g(y) = (s - y)[g'(y) + v(s)],$

where $t \in [a, b]$, $s \in I$, and $u(t) \to 0$ as $t \to x$, $v(s) \to 0$ as $s \to y$. Let s = f(t).
Using first (5) and then (4), we obtain

        $h(t) - h(x) = g(f(t)) - g(f(x))$
        $\quad = [f(t) - f(x)] \cdot [g'(y) + v(s)]$
        $\quad = (t - x) \cdot [f'(x) + u(t)] \cdot [g'(y) + v(s)],$

or, if $t \ne x$,

(6)     $\frac{h(t) - h(x)}{t - x} = [g'(y) + v(s)] \cdot [f'(x) + u(t)].$

Letting $t \to x$, we see that $s \to y$, by the continuity of f, so that the right
side of (6) tends to g'(y)f'(x), which gives (3).
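
Identity (6) can be checked exactly on a concrete pair of functions. The sketch below is an illustration under our own choices $f(t) = t^2$, $g(s) = s^3$ and x = 1 (none of which come from the text); the remainders u and v are computed from (4) and (5), with the value 0 assigned at t = x and s = y, and rational arithmetic avoids rounding.

```python
from fractions import Fraction


def f(t):
    return t ** 2            # f'(x) = 2x


def g(s):
    return s ** 3            # g'(y) = 3y^2


def h(t):
    return g(f(t))


x = Fraction(1)
y = f(x)
df_x, dg_y = 2 * x, 3 * y ** 2


def u(t):
    """Remainder in (4): (f(t) - f(x))/(t - x) - f'(x), with u(x) = 0."""
    return (f(t) - f(x)) / (t - x) - df_x if t != x else Fraction(0)


def v(s):
    """Remainder in (5): (g(s) - g(y))/(s - y) - g'(y), with v(y) = 0."""
    return (g(s) - g(y)) / (s - y) - dg_y if s != y else Fraction(0)


def quotient(t):
    """Left side of (6): (h(t) - h(x)) / (t - x) for t != x."""
    return (h(t) - h(x)) / (t - x)


t = x + Fraction(1, 10 ** 6)
print(quotient(t) == (dg_y + v(f(t))) * (df_x + u(t)))   # identity (6) holds exactly
print(float(quotient(t)), float(dg_y * df_x))            # quotient is close to h'(x) = 6
```

Setting v(y) = 0 is what keeps (5) valid even when f(t) = f(x) for some $t \ne x$ (here t = -1), a case in which one could not divide by s - y.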


5.6 Examples

(a) Let f be defined by

(7)     $f(x) = \begin{cases} x \sin \frac{1}{x} & (x \ne 0), \\ 0 & (x = 0). \end{cases}$

Taking for granted that the derivative of sin x is cos x (we shall
discuss the trigonometric functions in Chap. 8), we can apply Theorems
5.3 and 5.5 whenever $x \ne 0$, and obtain

(8)     $f'(x) = \sin \frac{1}{x} - \frac{1}{x} \cos \frac{1}{x} \qquad (x \ne 0).$

At x = 0, these theorems do not apply any longer, since 1/x is not defined
there, and we appeal directly to the definition: for $t \ne 0$,

        $\frac{f(t) - f(0)}{t - 0} = \sin \frac{1}{t}.$

As $t \to 0$, this does not tend to any limit, so that f'(0) does not exist.

(b) Let f be defined by

(9)     $f(x) = \begin{cases} x^2 \sin \frac{1}{x} & (x \ne 0), \\ 0 & (x = 0). \end{cases}$

As above, we obtain

(10)    $f'(x) = 2x \sin \frac{1}{x} - \cos \frac{1}{x} \qquad (x \ne 0).$

At x = 0, we appeal to the definition, and obtain

        $\left| \frac{f(t) - f(0)}{t - 0} \right| = \left| t \sin \frac{1}{t} \right| \le |t| \qquad (t \ne 0);$

letting $t \to 0$, we see that

(11)    $f'(0) = 0.$

Thus f is differentiable at all points x, but f' is not a continuous
function, since cos (1/x) in (10) does not tend to a limit as $x \to 0$.
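
Both halves of Example 5.6(b) can be observed numerically. The sketch below is an illustration only, with `math.sin` and `math.cos` standing in for the functions of Chap. 8 and with helper names of our own. The difference quotient at 0 is bounded by |t|, while f' takes values near -1 and near +1 at points arbitrarily close to 0.

```python
import math


def f(x):
    """Example 5.6(b): x^2 sin(1/x) for x != 0, and 0 at x = 0."""
    return x * x * math.sin(1 / x) if x != 0 else 0.0


def fprime(x):
    """Formula (10) for x != 0, and the value f'(0) = 0 from (11)."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x) if x != 0 else 0.0


def quotient_at_zero(t):
    """Difference quotient (f(t) - f(0)) / (t - 0) for t != 0."""
    return (f(t) - f(0.0)) / t


for k in (2, 4, 6, 8):
    t = 10.0 ** -k
    print(t, quotient_at_zero(t))          # magnitude is at most about t

near_minus_one = [fprime(1 / (2 * n * math.pi)) for n in range(1, 50)]
near_plus_one = [fprime(1 / ((2 * n + 1) * math.pi)) for n in range(1, 50)]
print(near_minus_one[-1], near_plus_one[-1])
```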


MEAN VALUE THEOREMS


5.7 Definition Let f be a real function defined on a metric space X. We say
that f has a local maximum at a point $p \in X$ if there exists $\delta > 0$ such that $f(q) \le f(p)$
for all $q \in X$ with $d(p, q) < \delta$.

Local minima are defined likewise. 

Our next theorem is the basis of many applications of differentiation. 


5.8 Theorem Let f be defined on [a, b]; if f has a local maximum at a point
$x \in (a, b)$, and if f'(x) exists, then f'(x) = 0.

The analogous statement for local minima is of course also true.

Proof Choose $\delta$ in accordance with Definition 5.7, so that

        $a < x - \delta < x < x + \delta < b.$

If $x - \delta < t < x$, then

        $\frac{f(t) - f(x)}{t - x} \ge 0.$

Letting $t \to x$, we see that $f'(x) \ge 0$.

If $x < t < x + \delta$, then

        $\frac{f(t) - f(x)}{t - x} \le 0,$

which shows that $f'(x) \le 0$. Hence f'(x) = 0.


5.9 Theorem If f and g are continuous real functions on [a, b] which are
differentiable in (a, b), then there is a point $x \in (a, b)$ at which

        $[f(b) - f(a)] g'(x) = [g(b) - g(a)] f'(x).$

Note that differentiability is not required at the endpoints.

Proof Put

        $h(t) = [f(b) - f(a)] g(t) - [g(b) - g(a)] f(t) \qquad (a \le t \le b).$

Then h is continuous on [a, b], h is differentiable in (a, b), and

(12)    $h(a) = f(b)g(a) - f(a)g(b) = h(b).$

To prove the theorem, we have to show that h'(x) = 0 for some $x \in (a, b)$.

If h is constant, this holds for every $x \in (a, b)$. If h(t) > h(a) for
some $t \in (a, b)$, let x be a point on [a, b] at which h attains its maximum
(Theorem 4.16). By (12), $x \in (a, b)$, and Theorem 5.8 shows that h'(x) = 0.
If h(t) < h(a) for some $t \in (a, b)$, the same argument applies if we choose
for x a point on [a, b] where h attains its minimum.

This theorem is often called a generalized mean value theorem; the following
special case is usually referred to as “the” mean value theorem:

5.10 Theorem If f is a real continuous function on [a, b] which is differentiable
in (a, b), then there is a point $x \in (a, b)$ at which

        $f(b) - f(a) = (b - a) f'(x).$

Proof Take g(x) = x in Theorem 5.9.
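
A small worked instance may help fix Theorems 5.9 and 5.10. The sketch below assumes the sample functions $f(x) = x^2$ and $g(x) = x^3$ on [0, 1], which are our own choice, and verifies with exact rational arithmetic that x = 2/3 (found by hand from $3x^2 = 2x$) satisfies the conclusion of Theorem 5.9, and that x = 1/2 satisfies the conclusion of Theorem 5.10 for f.

```python
from fractions import Fraction


def f(x):
    return x ** 2


def df(x):
    return 2 * x


def g(x):
    return x ** 3


def dg(x):
    return 3 * x ** 2


a, b = Fraction(0), Fraction(1)


def gap(x):
    """[f(b) - f(a)] g'(x) - [g(b) - g(a)] f'(x).

    Theorem 5.9 says that this vanishes at some x in (a, b).
    """
    return (f(b) - f(a)) * dg(x) - (g(b) - g(a)) * df(x)


x = Fraction(2, 3)                       # found by hand: 3x^2 = 2x
print(gap(x))                            # the conclusion of Theorem 5.9 at x = 2/3

x_mvt = Fraction(1, 2)                   # found by hand: 2x = 1
print(f(b) - f(a) == (b - a) * df(x_mvt))
```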

5.11 Theorem Suppose f is differentiable in (a, b).

(a) If $f'(x) \ge 0$ for all $x \in (a, b)$, then f is monotonically increasing.

(b) If f'(x) = 0 for all $x \in (a, b)$, then f is constant.

(c) If $f'(x) \le 0$ for all $x \in (a, b)$, then f is monotonically decreasing.

Proof All conclusions can be read off from the equation

        $f(x_2) - f(x_1) = (x_2 - x_1) f'(x),$

which is valid, for each pair of numbers $x_1$, $x_2$ in (a, b), for some x between
$x_1$ and $x_2$.


THE CONTINUITY OF DERIVATIVES 

We have already seen [Example 5.6(b)] that a function f may have a derivative
f' which exists at every point, but is discontinuous at some point. However, not
every function is a derivative. In particular, derivatives which exist at every
point of an interval have one important property in common with functions
which are continuous on an interval: Intermediate values are assumed (compare
Theorem 4.23). The precise statement follows.

5.12 Theorem Suppose f is a real differentiable function on [a, b] and suppose
$f'(a) < \lambda < f'(b)$. Then there is a point $x \in (a, b)$ such that $f'(x) = \lambda$.

A similar result holds of course if f'(a) > f'(b).

Proof Put $g(t) = f(t) - \lambda t$. Then g'(a) < 0, so that $g(t_1) < g(a)$ for some
$t_1 \in (a, b)$, and g'(b) > 0, so that $g(t_2) < g(b)$ for some $t_2 \in (a, b)$. Hence
g attains its minimum on [a, b] (Theorem 4.16) at some point x such that
a < x < b. By Theorem 5.8, g'(x) = 0. Hence $f'(x) = \lambda$.

Corollary If f is differentiable on [a, b], then f' cannot have any simple
discontinuities on [a, b].

But f' may very well have discontinuities of the second kind.


L’HOSPITAL’S RULE 

The following theorem is frequently useful in the evaluation of limits. 


5.13 Theorem Suppose f and g are real and differentiable in (a, b), and $g'(x) \ne 0$
for all $x \in (a, b)$, where $-\infty \le a < b \le +\infty$. Suppose

(13)    $\frac{f'(x)}{g'(x)} \to A$ as $x \to a$.

If

(14)    $f(x) \to 0$ and $g(x) \to 0$ as $x \to a$,

or if

(15)    $g(x) \to +\infty$ as $x \to a$,

then

(16)    $\frac{f(x)}{g(x)} \to A$ as $x \to a$.

The analogous statement is of course also true if $x \to b$, or if $g(x) \to -\infty$
in (15). Let us note that we now use the limit concept in the extended sense of
Definition 4.33.


Proof We first consider the case in which $-\infty \le A < +\infty$. Choose a
real number q such that A < q, and then choose r such that A < r < q.
By (13) there is a point $c \in (a, b)$ such that a < x < c implies

(17)    $\frac{f'(x)}{g'(x)} < r.$

If a < x < y < c, then Theorem 5.9 shows that there is a point $t \in (x, y)$
such that

(18)    $\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(t)}{g'(t)} < r.$

Suppose (14) holds. Letting $x \to a$ in (18), we see that

(19)    $\frac{f(y)}{g(y)} \le r < q \qquad (a < y < c).$

Next, suppose (15) holds. Keeping y fixed in (18), we can choose
a point $c_1 \in (a, y)$ such that g(x) > g(y) and g(x) > 0 if $a < x < c_1$.
Multiplying (18) by [g(x) - g(y)]/g(x), we obtain

(20)    $\frac{f(x)}{g(x)} < r - r \frac{g(y)}{g(x)} + \frac{f(y)}{g(x)} \qquad (a < x < c_1).$

If we let $x \to a$ in (20), (15) shows that there is a point $c_2 \in (a, c_1)$
such that

(21)    $\frac{f(x)}{g(x)} < q \qquad (a < x < c_2).$

Summing up, (19) and (21) show that for any q, subject only to the
condition A < q, there is a point $c_2$ such that f(x)/g(x) < q if $a < x < c_2$.

In the same manner, if $-\infty < A \le +\infty$, and p is chosen so that
p < A, we can find a point $c_3$ such that

(22)    $p < \frac{f(x)}{g(x)} \qquad (a < x < c_3),$

and (16) follows from these two statements.
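
The two hypotheses (14) and (15) can each be illustrated by a familiar limit at a = 0. The sketch below is a numerical illustration only; the sample functions are our own choice, and the elementary functions from Python's `math` module are taken for granted. In the first pair both f and g tend to 0 and f'/g' tends to 1/2; in the second pair g tends to $+\infty$ and f'/g' tends to 0.

```python
import math


def ratio_14(x):
    """f(x)/g(x) with f = 1 - cos x, g = x^2; both tend to 0 as x -> 0."""
    return (1 - math.cos(x)) / (x * x)


def derivative_ratio_14(x):
    """f'(x)/g'(x) = sin x / (2x); tends to 1/2 as x -> 0."""
    return math.sin(x) / (2 * x)


def ratio_15(x):
    """f(x)/g(x) with f = log x, g = 1/x; g tends to +infinity as x -> 0+."""
    return math.log(x) * x


def derivative_ratio_15(x):
    """f'(x)/g'(x) = (1/x) / (-1/x^2) = -x; tends to 0 as x -> 0+."""
    return -x


for x in (1e-1, 1e-2, 1e-3):
    print(x, ratio_14(x), derivative_ratio_14(x), ratio_15(x), derivative_ratio_15(x))
```

Much smaller values of x are avoided in the first pair because 1 - cos x then loses accuracy in floating-point arithmetic.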


DERIVATIVES OF HIGHER ORDER 

5.14 Definition If f has a derivative f' on an interval, and if f' is itself
differentiable, we denote the derivative of f' by f'' and call f'' the second derivative of f.
Continuing in this manner, we obtain functions

        $f, f', f'', f^{(3)}, \ldots, f^{(n)},$

each of which is the derivative of the preceding one. $f^{(n)}$ is called the nth
derivative, or the derivative of order n, of f.

In order for $f^{(n)}(x)$ to exist at a point x, $f^{(n-1)}(t)$ must exist in a
neighborhood of x (or in a one-sided neighborhood, if x is an endpoint of the interval
on which f is defined), and $f^{(n-1)}$ must be differentiable at x. Since $f^{(n-1)}$ must
exist in a neighborhood of x, $f^{(n-2)}$ must be differentiable in that neighborhood.


TAYLOR’S THEOREM 


5.15 Theorem Suppose f is a real function on [a, b], n is a positive integer,
$f^{(n-1)}$ is continuous on [a, b], $f^{(n)}(t)$ exists for every $t \in (a, b)$. Let $\alpha$, $\beta$ be distinct
points of [a, b], and define

(23)    $P(t) = \sum_{k=0}^{n-1} \frac{f^{(k)}(\alpha)}{k!} (t - \alpha)^k.$

Then there exists a point x between $\alpha$ and $\beta$ such that

(24)    $f(\beta) = P(\beta) + \frac{f^{(n)}(x)}{n!} (\beta - \alpha)^n.$

For n = 1, this is just the mean value theorem. In general, the theorem
shows that f can be approximated by a polynomial of degree n - 1, and that
(24) allows us to estimate the error, if we know bounds on $|f^{(n)}(x)|$.

Proof Let M be the number defined by

(25)    $f(\beta) = P(\beta) + M(\beta - \alpha)^n$

and put

(26)    $g(t) = f(t) - P(t) - M(t - \alpha)^n \qquad (a \le t \le b).$

We have to show that $n!M = f^{(n)}(x)$ for some x between $\alpha$ and $\beta$. By
(23) and (26),

(27)    $g^{(n)}(t) = f^{(n)}(t) - n!M \qquad (a < t < b).$

Hence the proof will be complete if we can show that $g^{(n)}(x) = 0$ for some
x between $\alpha$ and $\beta$.

Since $P^{(k)}(\alpha) = f^{(k)}(\alpha)$ for k = 0, ..., n - 1, we have

(28)    $g(\alpha) = g'(\alpha) = \cdots = g^{(n-1)}(\alpha) = 0.$

Our choice of M shows that $g(\beta) = 0$, so that $g'(x_1) = 0$ for some $x_1$
between $\alpha$ and $\beta$, by the mean value theorem. Since $g'(\alpha) = 0$, we conclude
similarly that $g''(x_2) = 0$ for some $x_2$ between $\alpha$ and $x_1$. After n steps we
arrive at the conclusion that $g^{(n)}(x_n) = 0$ for some $x_n$ between $\alpha$ and $x_{n-1}$,
that is, between $\alpha$ and $\beta$.


DIFFERENTIATION OF VECTOR-VALUED FUNCTIONS 

5.16 Remarks Definition 5.1 applies without any change to complex functions
$f$ defined on $[a, b]$, and Theorems 5.2 and 5.3, as well as their proofs, remain
valid. If $f_1$ and $f_2$ are the real and imaginary parts of $f$, that is, if

$f(t) = f_1(t) + i f_2(t)$

for $a \le t \le b$, where $f_1(t)$ and $f_2(t)$ are real, then we clearly have

(29) $f'(x) = f_1'(x) + i f_2'(x);$

also, $f$ is differentiable at $x$ if and only if both $f_1$ and $f_2$ are differentiable at $x$.





Passing to vector-valued functions in general, i.e., to functions $\mathbf{f}$ which
map $[a, b]$ into some $R^k$, we may still apply Definition 5.1 to define $\mathbf{f}'(x)$. The
term $\phi(t)$ in (1) is now, for each $t$, a point in $R^k$, and the limit in (2) is taken with
respect to the norm of $R^k$. In other words, $\mathbf{f}'(x)$ is that point of $R^k$ (if there is
one) for which

(30) $\lim_{t \to x} \left| \frac{\mathbf{f}(t) - \mathbf{f}(x)}{t - x} - \mathbf{f}'(x) \right| = 0,$

and $\mathbf{f}'$ is again a function with values in $R^k$.

If $f_1, \dots, f_k$ are the components of $\mathbf{f}$, as defined in Theorem 4.10, then

(31) $\mathbf{f}' = (f_1', \dots, f_k'),$

and $\mathbf{f}$ is differentiable at a point $x$ if and only if each of the functions
$f_1, \dots, f_k$ is differentiable at $x$.

Theorem 5.2 is true in this context as well, and so is Theorem 5.3(a) and
(b), if $fg$ is replaced by the inner product $\mathbf{f} \cdot \mathbf{g}$ (see Definition 4.3).

When we turn to the mean value theorem, however, and to one of its 
consequences, namely, L’Hospital’s rule, the situation changes. The next two 
examples will show that each of these results fails to be true for complex-valued 
functions. 


5.17 Example Define, for real $x$,

(32) $f(x) = e^{ix} = \cos x + i \sin x.$

(The last expression may be taken as the definition of the complex exponential
$e^{ix}$; see Chap. 8 for a full discussion of these functions.) Then

(33) $f(2\pi) - f(0) = 1 - 1 = 0,$

but

(34) $f'(x) = i e^{ix},$

so that $|f'(x)| = 1$ for all real $x$.

Thus Theorem 5.10 fails to hold in this case.
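Example 5.17 is easy to confirm in floating point. The sketch below assumes only Python's standard `cmath` module; the sample grid of 999 interior points is an arbitrary illustrative choice. It compares the increment $|f(2\pi) - f(0)|$ with the smallest value of $(b - a)|f'(x)|$ found on the grid.

```python
import cmath
import math


def f(x: float) -> complex:
    """f(x) = e^{ix}, as in (32)."""
    return cmath.exp(1j * x)


def f_prime(x: float) -> complex:
    """f'(x) = i e^{ix}, as in (34)."""
    return 1j * cmath.exp(1j * x)


a, b = 0.0, 2 * math.pi
increment = abs(f(b) - f(a))  # essentially 0, by (33)

# Sample interior points; |f'(x)| = 1 everywhere, so (b - a)|f'(x)| = 2*pi.
samples = [a + (b - a) * k / 1000 for k in range(1, 1000)]
min_rhs = min((b - a) * abs(f_prime(x)) for x in samples)
print(increment, min_rhs)
```

No sampled point makes $f(b) - f(a) = (b - a) f'(x)$ possible, since the left side vanishes (up to rounding) while the right side has modulus $2\pi$.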


5.18 Example On the segment $(0, 1)$, define $f(x) = x$ and

(35) $g(x) = x + x^2 e^{i/x^2}.$

Since $|e^{it}| = 1$ for all real $t$, we see that

(36) $\lim_{x \to 0} \frac{f(x)}{g(x)} = 1.$

Next,

(37) $g'(x) = 1 + \left\{ 2x - \frac{2i}{x} \right\} e^{i/x^2} \qquad (0 < x < 1),$

so that

(38) $|g'(x)| \ge \left| 2x - \frac{2i}{x} \right| - 1 \ge \frac{2}{x} - 1.$

Hence

(39) $\left| \frac{f'(x)}{g'(x)} \right| = \frac{1}{|g'(x)|} \le \frac{x}{2 - x}$

and so

(40) $\lim_{x \to 0} \frac{f'(x)}{g'(x)} = 0.$

By (36) and (40), L'Hospital's rule fails in this case. Note also that $g'(x) \ne 0$
on $(0, 1)$, by (38).
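The two limits (36) and (40) can be observed numerically. This is a minimal sketch using the formulas (35) and (37); the sample points $10^{-1}, 10^{-2}, 10^{-3}$ are an arbitrary illustrative choice.

```python
import cmath


def g(x: float) -> complex:
    """g(x) = x + x^2 e^{i/x^2}, as in (35)."""
    return x + x**2 * cmath.exp(1j / x**2)


def g_prime(x: float) -> complex:
    """g'(x) = 1 + (2x - 2i/x) e^{i/x^2}, as in (37)."""
    return 1 + (2 * x - 2j / x) * cmath.exp(1j / x**2)


rows = []
for x in (1e-1, 1e-2, 1e-3):
    ratio = x / g(x)            # f(x)/g(x) with f(x) = x
    deriv_ratio = 1 / g_prime(x)  # f'(x)/g'(x) with f'(x) = 1
    rows.append((x, abs(ratio - 1), abs(deriv_ratio)))
    print(x, abs(ratio - 1), abs(deriv_ratio))
```

As $x$ shrinks, the second column (distance of $f/g$ from 1) and the third column (modulus of $f'/g'$) both tend to 0, so the two quotients have different limits.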

However, there is a consequence of the mean value theorem which, for
purposes of applications, is almost as useful as Theorem 5.10, and which
remains true for vector-valued functions: From Theorem 5.10 it follows that

(41) $|f(b) - f(a)| \le (b - a) \sup_{a < x < b} |f'(x)|.$


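5.19 Theorem Suppose $\mathbf{f}$ is a continuous mapping of $[a, b]$ into $R^k$ and $\mathbf{f}$ is
differentiable in $(a, b)$. Then there exists $x \in (a, b)$ such that

$|\mathbf{f}(b) - \mathbf{f}(a)| \le (b - a) |\mathbf{f}'(x)|.$

Proof (see footnote 1) Put $\mathbf{z} = \mathbf{f}(b) - \mathbf{f}(a)$, and define

$\varphi(t) = \mathbf{z} \cdot \mathbf{f}(t) \qquad (a \le t \le b).$

Then $\varphi$ is a real-valued continuous function on $[a, b]$ which is
differentiable in $(a, b)$. The mean value theorem shows therefore that

$\varphi(b) - \varphi(a) = (b - a) \varphi'(x) = (b - a)\, \mathbf{z} \cdot \mathbf{f}'(x)$

for some $x \in (a, b)$. On the other hand,

$\varphi(b) - \varphi(a) = \mathbf{z} \cdot \mathbf{f}(b) - \mathbf{z} \cdot \mathbf{f}(a) = \mathbf{z} \cdot \mathbf{z} = |\mathbf{z}|^2.$

The Schwarz inequality now gives

$|\mathbf{z}|^2 = (b - a) |\mathbf{z} \cdot \mathbf{f}'(x)| \le (b - a) |\mathbf{z}|\, |\mathbf{f}'(x)|.$

Hence $|\mathbf{z}| \le (b - a) |\mathbf{f}'(x)|$, which is the desired conclusion.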


1 V. P. Havin translated the second edition of this book into Russian and added this 
proof to the original one. 
EXERCISES


1. Let $f$ be defined for all real $x$, and suppose that

$|f(x) - f(y)| \le (x - y)^2$

for all real $x$ and $y$. Prove that $f$ is constant.

2. Suppose $f'(x) > 0$ in $(a, b)$. Prove that $f$ is strictly increasing in $(a, b)$, and let $g$ be
its inverse function. Prove that $g$ is differentiable, and that

$g'(f(x)) = \frac{1}{f'(x)} \qquad (a < x < b).$

3. Suppose $g$ is a real function on $R^1$, with bounded derivative (say $|g'| \le M$). Fix
$\varepsilon > 0$, and define $f(x) = x + \varepsilon g(x)$. Prove that $f$ is one-to-one if $\varepsilon$ is small enough.
(A set of admissible values of $\varepsilon$ can be determined which depends only on $M$.)

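4. If

$C_0 + \frac{C_1}{2} + \cdots + \frac{C_{n-1}}{n} + \frac{C_n}{n+1} = 0,$

where $C_0, \dots, C_n$ are real constants, prove that the equation

$C_0 + C_1 x + \cdots + C_{n-1} x^{n-1} + C_n x^n = 0$

has at least one real root between 0 and 1.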

5. Suppose $f$ is defined and differentiable for every $x > 0$, and $f'(x) \to 0$ as $x \to +\infty$.
Put $g(x) = f(x + 1) - f(x)$. Prove that $g(x) \to 0$ as $x \to +\infty$.

6. Suppose

(a) $f$ is continuous for $x \ge 0$,
(b) $f'(x)$ exists for $x > 0$,
(c) $f(0) = 0$,
(d) $f'$ is monotonically increasing.

Put

$g(x) = \frac{f(x)}{x} \qquad (x > 0)$

and prove that $g$ is monotonically increasing.

7. Suppose $f'(x)$, $g'(x)$ exist, $g'(x) \ne 0$, and $f(x) = g(x) = 0$. Prove that

$\lim_{t \to x} \frac{f(t)}{g(t)} = \frac{f'(x)}{g'(x)}.$

(This holds also for complex functions.)

8. Suppose $f'$ is continuous on $[a, b]$ and $\varepsilon > 0$. Prove that there exists $\delta > 0$ such
that

$\left| \frac{f(t) - f(x)}{t - x} - f'(x) \right| < \varepsilon$

whenever $0 < |t - x| < \delta$, $a \le x \le b$, $a \le t \le b$. (This could be expressed by
saying that $f$ is uniformly differentiable on $[a, b]$ if $f'$ is continuous on $[a, b]$.) Does
this hold for vector-valued functions too?

9. Let $f$ be a continuous real function on $R^1$, of which it is known that $f'(x)$ exists
for all $x \ne 0$ and that $f'(x) \to 3$ as $x \to 0$. Does it follow that $f'(0)$ exists?

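10. Suppose $f$ and $g$ are complex differentiable functions on $(0, 1)$, $f(x) \to 0$, $g(x) \to 0$,
$f'(x) \to A$, $g'(x) \to B$ as $x \to 0$, where $A$ and $B$ are complex numbers, $B \ne 0$. Prove
that

$\lim_{x \to 0} \frac{f(x)}{g(x)} = \frac{A}{B}.$

Compare with Example 5.18. Hint:

$\frac{f(x)}{g(x)} = \left\{ \frac{f(x)}{x} - A \right\} \cdot \frac{x}{g(x)} + A \cdot \frac{x}{g(x)}.$

Apply Theorem 5.13 to the real and imaginary parts of $f(x)/x$ and $g(x)/x$.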

11. Suppose $f$ is defined in a neighborhood of $x$, and suppose $f''(x)$ exists. Show that

$\lim_{h \to 0} \frac{f(x + h) + f(x - h) - 2f(x)}{h^2} = f''(x).$

Show by an example that the limit may exist even if $f''(x)$ does not.

Hint: Use Theorem 5.13.

12. If $f(x) = |x|^3$, compute $f'(x)$, $f''(x)$ for all real $x$, and show that $f^{(3)}(0)$ does not
exist.

13. Suppose $a$ and $c$ are real numbers, $c > 0$, and $f$ is defined on $[-1, 1]$ by

$f(x) = \begin{cases} x^a \sin(|x|^{-c}) & (\text{if } x \ne 0), \\ 0 & (\text{if } x = 0). \end{cases}$

Prove the following statements:

(a) $f$ is continuous if and only if $a > 0$.

(b) $f'(0)$ exists if and only if $a > 1$.

(c) $f'$ is bounded if and only if $a \ge 1 + c$.

(d) $f'$ is continuous if and only if $a > 1 + c$.

(e) $f''(0)$ exists if and only if $a > 2 + c$.

(f) $f''$ is bounded if and only if $a \ge 2 + 2c$.

(g) $f''$ is continuous if and only if $a > 2 + 2c$.

14. Let $f$ be a differentiable real function defined in $(a, b)$. Prove that $f$ is convex if
and only if $f'$ is monotonically increasing. Assume next that $f''(x)$ exists for
every $x \in (a, b)$, and prove that $f$ is convex if and only if $f''(x) \ge 0$ for all $x \in (a, b)$.

15. Suppose $a \in R^1$, $f$ is a twice-differentiable real function on $(a, \infty)$, and $M_0$, $M_1$, $M_2$
are the least upper bounds of $|f(x)|$, $|f'(x)|$, $|f''(x)|$, respectively, on $(a, \infty)$.
Prove that

$M_1^2 \le 4 M_0 M_2.$

Hint: If $h > 0$, Taylor's theorem shows that

$f'(x) = \frac{1}{2h} [f(x + 2h) - f(x)] - h f''(\xi)$

for some $\xi \in (x, x + 2h)$. Hence

$|f'(x)| \le h M_2 + \frac{M_0}{h}.$

To show that $M_1^2 = 4 M_0 M_2$ can actually happen, take $a = -1$, define

$f(x) = \begin{cases} 2x^2 - 1 & (-1 < x < 0), \\ \dfrac{x^2 - 1}{x^2 + 1} & (0 \le x < \infty), \end{cases}$

and show that $M_0 = 1$, $M_1 = 4$, $M_2 = 4$.

Does $M_1^2 \le 4 M_0 M_2$ hold for vector-valued functions too?

16. Suppose $f$ is twice-differentiable on $(0, \infty)$, $f''$ is bounded on $(0, \infty)$, and $f(x) \to 0$
as $x \to \infty$. Prove that $f'(x) \to 0$ as $x \to \infty$.

Hint: Let $a \to \infty$ in Exercise 15.

17. Suppose $f$ is a real, three times differentiable function on $[-1, 1]$, such that

$f(-1) = 0, \quad f(0) = 0, \quad f(1) = 1, \quad f'(0) = 0.$

Prove that $f^{(3)}(x) \ge 3$ for some $x \in (-1, 1)$.

Note that equality holds for $\frac{1}{2}(x^3 + x^2)$.

Hint: Use Theorem 5.15, with $\alpha = 0$ and $\beta = \pm 1$, to show that there exist
$s \in (0, 1)$ and $t \in (-1, 0)$ such that

$f^{(3)}(s) + f^{(3)}(t) = 6.$

18. Suppose $f$ is a real function on $[a, b]$, $n$ is a positive integer, and $f^{(n-1)}$ exists for
every $t \in [a, b]$. Let $\alpha$, $\beta$, and $P$ be as in Taylor's theorem (5.15). Define

$Q(t) = \frac{f(t) - f(\beta)}{t - \beta}$

for $t \in [a, b]$, $t \ne \beta$, differentiate

$f(t) - f(\beta) = (t - \beta) Q(t)$

$n - 1$ times at $t = \alpha$, and derive the following version of Taylor's theorem:

$f(\beta) = P(\beta) + \frac{Q^{(n-1)}(\alpha)}{(n - 1)!} (\beta - \alpha)^n.$

19. Suppose $f$ is defined in $(-1, 1)$ and $f'(0)$ exists. Suppose $-1 < \alpha_n < \beta_n < 1$,
$\alpha_n \to 0$, and $\beta_n \to 0$ as $n \to \infty$. Define the difference quotients

$D_n = \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n}.$

Prove the following statements:

(a) If $\alpha_n < 0 < \beta_n$, then $\lim D_n = f'(0)$.

(b) If $0 < \alpha_n < \beta_n$ and $\{\beta_n / (\beta_n - \alpha_n)\}$ is bounded, then $\lim D_n = f'(0)$.

(c) If $f'$ is continuous in $(-1, 1)$, then $\lim D_n = f'(0)$.

Give an example in which $f$ is differentiable in $(-1, 1)$ (but $f'$ is not
continuous at 0) and in which $\alpha_n$, $\beta_n$ tend to 0 in such a way that $\lim D_n$ exists but is
different from $f'(0)$.

20. Formulate and prove an inequality which follows from Taylor’s theorem and 
which remains valid for vector-valued functions. 

21. Let $E$ be a closed subset of $R^1$. We saw in Exercise 22, Chap. 4, that there is a
real continuous function $f$ on $R^1$ whose zero set is $E$. Is it possible, for each closed
set $E$, to find such an $f$ which is differentiable on $R^1$, or one which is $n$ times
differentiable, or even one which has derivatives of all orders on $R^1$?

22. Suppose $f$ is a real function on $(-\infty, \infty)$. Call $x$ a fixed point of $f$ if $f(x) = x$.

(a) If $f$ is differentiable and $f'(t) \ne 1$ for every real $t$, prove that $f$ has at most one
fixed point.

(b) Show that the function $f$ defined by

$f(t) = t + (1 + e^t)^{-1}$

has no fixed point, although $0 < f'(t) < 1$ for all real $t$.

(c) However, if there is a constant $A < 1$ such that $|f'(t)| \le A$ for all real $t$, prove
that a fixed point $x$ of $f$ exists, and that $x = \lim x_n$, where $x_1$ is an arbitrary real
number and

$x_{n+1} = f(x_n)$

for $n = 1, 2, 3, \dots$.

(d) Show that the process described in (c) can be visualized by the zig-zag path

$(x_1, x_2) \to (x_2, x_2) \to (x_2, x_3) \to (x_3, x_3) \to (x_3, x_4) \to \cdots.$

23. The function $f$ defined by

$f(x) = \frac{x^3 + 1}{3}$

has three fixed points, say $\alpha$, $\beta$, $\gamma$, where

$-2 < \alpha < -1, \quad 0 < \beta < 1, \quad 1 < \gamma < 2.$

For arbitrarily chosen $x_1$, define $\{x_n\}$ by setting $x_{n+1} = f(x_n)$.

(a) If $x_1 < \alpha$, prove that $x_n \to -\infty$ as $n \to \infty$.

(b) If $\alpha < x_1 < \gamma$, prove that $x_n \to \beta$ as $n \to \infty$.

(c) If $\gamma < x_1$, prove that $x_n \to +\infty$ as $n \to \infty$.

Thus $\beta$ can be located by this method, but $\alpha$ and $\gamma$ cannot.
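The iteration $x_{n+1} = f(x_n)$ of Exercises 22 and 23 can be tried out directly. The following is a minimal sketch for $f(x) = (x^3 + 1)/3$; the starting values 0.5, 1.5 and 1.6 and the step counts are illustrative assumptions, and the run is of course no substitute for the proofs the exercise asks for.

```python
def f(x: float) -> float:
    """The map of Exercise 23: f(x) = (x^3 + 1) / 3."""
    return (x**3 + 1) / 3


def iterate(x1: float, steps: int) -> float:
    """Apply x -> f(x) `steps` times, starting from x1."""
    x = x1
    for _ in range(steps):
        x = f(x)
    return x


beta = iterate(0.5, 100)           # start between alpha and gamma
beta_from_above = iterate(1.5, 200)  # still below gamma (about 1.53)
escaping = iterate(1.6, 6)         # start above gamma; only a few steps,
                                   # since the iterates grow very quickly
print(beta, beta_from_above, escaping)
```

Both starting values inside $(\alpha, \gamma)$ settle on the same number in $(0, 1)$, while the start just above $\gamma$ runs off toward $+\infty$.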





24. The process described in part (c) of Exercise 22 can of course also be applied to
functions that map $(0, \infty)$ to $(0, \infty)$.

Fix some $\alpha > 1$, and put

$f(x) = \frac{1}{2} \left( x + \frac{\alpha}{x} \right), \qquad g(x) = \frac{\alpha + x}{1 + x}.$

Both $f$ and $g$ have $\sqrt{\alpha}$ as their only fixed point in $(0, \infty)$. Try to explain, on the
basis of properties of $f$ and $g$, why the convergence in Exercise 16, Chap. 3, is so
much more rapid than it is in Exercise 17. (Compare $f'$ and $g'$, draw the zig-zags
suggested in Exercise 22.)

Do the same when $0 < \alpha < 1$.

25. Suppose $f$ is twice differentiable on $[a, b]$, $f(a) < 0$, $f(b) > 0$, $f'(x) \ge \delta > 0$, and
$0 \le f''(x) \le M$ for all $x \in [a, b]$. Let $\xi$ be the unique point in $(a, b)$ at which
$f(\xi) = 0$.

Complete the details in the following outline of Newton's method for
computing $\xi$.

(a) Choose $x_1 \in (\xi, b)$, and define $\{x_n\}$ by

$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$

Interpret this geometrically, in terms of a tangent to the graph of $f$.

(b) Prove that $x_{n+1} < x_n$ and that

$\lim_{n \to \infty} x_n = \xi.$

(c) Use Taylor's theorem to show that

$x_{n+1} - \xi = \frac{f''(t_n)}{2 f'(x_n)} (x_n - \xi)^2$

for some $t_n \in (\xi, x_n)$.

(d) If $A = M / 2\delta$, deduce that

$0 \le x_{n+1} - \xi \le \frac{1}{A} [A(x_1 - \xi)]^{2^n}.$

(Compare with Exercises 16 and 18, Chap. 3.)

(e) Show that Newton's method amounts to finding a fixed point of the function
$g$ defined by

$g(x) = x - \frac{f(x)}{f'(x)}.$

How does $g'(x)$ behave for $x$ near $\xi$?

(f) Put $f(x) = x^{1/3}$ on $(-\infty, \infty)$ and try Newton's method. What happens?
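The outline of Exercise 25 can be explored numerically. The sketch below is an illustration under stated assumptions: it takes $f(x) = x^2 - 2$ on $[1, 3]$ (so $\delta = 2$, $M = 2$, $A = 1/2$, $\xi = \sqrt{2}$) with $x_1 = 2$, and the helper names `newton`, `cbrt` and `cbrt_prime` are hypothetical. It also runs part (f), using a real cube root.

```python
import math
from typing import Callable, List


def newton(f: Callable[[float], float],
           f_prime: Callable[[float], float],
           x1: float, steps: int) -> List[float]:
    """Return [x_1, x_2, ..., x_{steps+1}] with x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / f_prime(x))
    return xs


# Parts (a)-(d): f(x) = x^2 - 2 on [1, 3]; f' >= 2 and f'' = 2 there.
xs = newton(lambda x: x * x - 2, lambda x: 2 * x, 2.0, 5)
xi = math.sqrt(2)
errors = [x - xi for x in xs]  # errors[n] is x_{n+1} - xi
A = 2 / (2 * 2)                # A = M / (2 * delta)
bounds = [(1 / A) * (A * errors[0]) ** (2**n) for n in range(len(xs))]


# Part (f): f(x) = x^(1/3), extended to negative x as an odd function.
def cbrt(x: float) -> float:
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def cbrt_prime(x: float) -> float:
    return abs(x) ** (-2.0 / 3.0) / 3.0


ys = newton(cbrt, cbrt_prime, 0.1, 5)
print(errors)
print(ys)
```

In the first run the error roughly squares at each step, consistent with (d). In the second run each iterate is $-2$ times the previous one, so the sequence moves away from the root at 0.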
26. Suppose $f$ is differentiable on $[a, b]$, $f(a) = 0$, and there is a real number $A$ such
that $|f'(x)| \le A |f(x)|$ on $[a, b]$. Prove that $f(x) = 0$ for all $x \in [a, b]$. Hint: Fix
$x_0 \in [a, b]$, let

$M_0 = \sup |f(x)|, \qquad M_1 = \sup |f'(x)|$

for $a \le x \le x_0$. For any such $x$,

$|f(x)| \le M_1 (x_0 - a) \le A (x_0 - a) M_0.$

Hence $M_0 = 0$ if $A(x_0 - a) < 1$. That is, $f = 0$ on $[a, x_0]$. Proceed.

27. Let $\phi$ be a real function defined on a rectangle $R$ in the plane, given by $a \le x \le b$,
$\alpha \le y \le \beta$. A solution of the initial-value problem

$y' = \phi(x, y), \qquad y(a) = c \qquad (\alpha \le c \le \beta)$

is, by definition, a differentiable function $f$ on $[a, b]$ such that $f(a) = c$, $\alpha \le f(x) \le \beta$,
and

$f'(x) = \phi(x, f(x)) \qquad (a \le x \le b).$

Prove that such a problem has at most one solution if there is a constant $A$ such
that

$|\phi(x, y_2) - \phi(x, y_1)| \le A |y_2 - y_1|$

whenever $(x, y_1) \in R$ and $(x, y_2) \in R$.

Hint: Apply Exercise 26 to the difference of two solutions. Note that this
uniqueness theorem does not hold for the initial-value problem

$y' = y^{1/2}, \qquad y(0) = 0,$

which has two solutions: $f(x) = 0$ and $f(x) = x^2 / 4$. Find all other solutions.

28. Formulate and prove an analogous uniqueness theorem for systems of differential
equations of the form

$y_j' = \phi_j(x, y_1, \dots, y_k), \qquad y_j(a) = c_j \qquad (j = 1, \dots, k).$

Note that this can be rewritten in the form

$\mathbf{y}' = \boldsymbol{\phi}(x, \mathbf{y}), \qquad \mathbf{y}(a) = \mathbf{c}$

where $\mathbf{y} = (y_1, \dots, y_k)$ ranges over a $k$-cell, $\boldsymbol{\phi}$ is the mapping of a $(k + 1)$-cell
into the Euclidean $k$-space whose components are the functions $\phi_1, \dots, \phi_k$, and $\mathbf{c}$
is the vector $(c_1, \dots, c_k)$. Use Exercise 26, for vector-valued functions.

29. Specialize Exercise 28 by considering the system

$y_j' = y_{j+1} \qquad (j = 1, \dots, k - 1),$

$y_k' = f(x) - \sum_{j=1}^{k} g_j(x) y_j,$

where $f, g_1, \dots, g_k$ are continuous real functions on $[a, b]$, and derive a uniqueness
theorem for solutions of the equation

$y^{(k)} + g_k(x) y^{(k-1)} + \cdots + g_2(x) y' + g_1(x) y = f(x),$

subject to initial conditions

$y(a) = c_1, \quad y'(a) = c_2, \quad \dots, \quad y^{(k-1)}(a) = c_k.$



6 

THE RIEMANN-STIELTJES INTEGRAL 


The present chapter is based on a definition of the Riemann integral which
depends very explicitly on the order structure of the real line. Accordingly,
we begin by discussing integration of real-valued functions on intervals.
Extensions to complex- and vector-valued functions on intervals follow in later
sections. Integration over sets other than intervals is discussed in Chaps. 10
and 11.


DEFINITION AND EXISTENCE OF THE INTEGRAL 

6.1 Definition Let $[a, b]$ be a given interval. By a partition $P$ of $[a, b]$ we
mean a finite set of points $x_0, x_1, \dots, x_n$, where

$a = x_0 \le x_1 \le \cdots \le x_{n-1} \le x_n = b.$

We write

$\Delta x_i = x_i - x_{i-1} \qquad (i = 1, \dots, n).$

Now suppose $f$ is a bounded real function defined on $[a, b]$. Corresponding to
each partition $P$ of $[a, b]$ we put

$M_i = \sup f(x) \qquad (x_{i-1} \le x \le x_i),$

$m_i = \inf f(x) \qquad (x_{i-1} \le x \le x_i),$

$U(P, f) = \sum_{i=1}^{n} M_i \, \Delta x_i,$

$L(P, f) = \sum_{i=1}^{n} m_i \, \Delta x_i,$

and finally

(1) $\overline{\int_a^b} f \, dx = \inf U(P, f),$

(2) $\underline{\int_a^b} f \, dx = \sup L(P, f),$

where the inf and the sup are taken over all partitions $P$ of $[a, b]$. The left
members of (1) and (2) are called the upper and lower Riemann integrals of $f$
over $[a, b]$, respectively.

If the upper and lower integrals are equal, we say that $f$ is
Riemann-integrable on $[a, b]$, we write $f \in \mathscr{R}$ (that is, $\mathscr{R}$ denotes the set of
Riemann-integrable functions), and we denote the common value of (1) and (2) by

(3) $\int_a^b f \, dx,$

or by

(4) $\int_a^b f(x) \, dx.$

This is the Riemann integral of $f$ over $[a, b]$. Since $f$ is bounded, there
exist two numbers, $m$ and $M$, such that

$m \le f(x) \le M \qquad (a \le x \le b).$

Hence, for every $P$,

$m(b - a) \le L(P, f) \le U(P, f) \le M(b - a),$

so that the numbers $L(P, f)$ and $U(P, f)$ form a bounded set. This shows that
the upper and lower integrals are defined for every bounded function $f$. The
question of their equality, and hence the question of the integrability of $f$, is a
more delicate one. Instead of investigating it separately for the Riemann integral,
we shall immediately consider a more general situation.
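The sums $U(P, f)$ and $L(P, f)$ of Definition 6.1 are easy to compute when $f$ is monotonically increasing, because then $M_i = f(x_i)$ and $m_i = f(x_{i-1})$. The sketch below relies on that assumption; the function $f(x) = x^2$ on $[0, 1]$, the uniform partitions, and the helper names are illustrative choices, not part of the text.

```python
from typing import Callable, List, Tuple


def uniform_partition(a: float, b: float, n: int) -> List[float]:
    """The partition x_i = a + (b - a) * i / n, i = 0, ..., n."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def upper_lower_sums_increasing(f: Callable[[float], float],
                                partition: List[float]) -> Tuple[float, float]:
    """U(P, f) and L(P, f) for a monotonically increasing f.

    For such f the sup on [x_{i-1}, x_i] is f(x_i) and the inf is f(x_{i-1}).
    """
    pairs = list(zip(partition, partition[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in pairs)
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in pairs)
    return upper, lower


def square(x: float) -> float:
    return x * x


results = {}
for n in (10, 100, 1000):
    results[n] = upper_lower_sums_increasing(square, uniform_partition(0.0, 1.0, n))
    print(n, results[n])
```

For this example $U - L = 1/n$, so the upper and lower sums squeeze the value $1/3$ from both sides as the partition is refined.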
6.2 Definition Let $\alpha$ be a monotonically increasing function on $[a, b]$ (since
$\alpha(a)$ and $\alpha(b)$ are finite, it follows that $\alpha$ is bounded on $[a, b]$). Corresponding to
each partition $P$ of $[a, b]$, we write

$\Delta \alpha_i = \alpha(x_i) - \alpha(x_{i-1}).$

It is clear that $\Delta \alpha_i \ge 0$. For any real function $f$ which is bounded on $[a, b]$
we put

$U(P, f, \alpha) = \sum_{i=1}^{n} M_i \, \Delta \alpha_i,$

$L(P, f, \alpha) = \sum_{i=1}^{n} m_i \, \Delta \alpha_i,$

where $M_i$, $m_i$ have the same meaning as in Definition 6.1, and we define

(5) $\overline{\int_a^b} f \, d\alpha = \inf U(P, f, \alpha),$

(6) $\underline{\int_a^b} f \, d\alpha = \sup L(P, f, \alpha),$

the inf and sup again being taken over all partitions.

If the left members of (5) and (6) are equal, we denote their common
value by

(7) $\int_a^b f \, d\alpha$

or sometimes by

(8) $\int_a^b f(x) \, d\alpha(x).$

This is the Riemann-Stieltjes integral (or simply the Stieltjes integral) of
$f$ with respect to $\alpha$, over $[a, b]$.

If (7) exists, i.e., if (5) and (6) are equal, we say that $f$ is integrable with
respect to $\alpha$, in the Riemann sense, and write $f \in \mathscr{R}(\alpha)$.

By taking $\alpha(x) = x$, the Riemann integral is seen to be a special case of
the Riemann-Stieltjes integral. Let us mention explicitly, however, that in the
general case $\alpha$ need not even be continuous.

A few words should be said about the notation. We prefer (7) to (8), since
the letter $x$ which appears in (8) adds nothing to the content of (7). It is
immaterial which letter we use to represent the so-called "variable of integration."
For instance, (8) is the same as

$\int_a^b f(y) \, d\alpha(y).$

The integral depends on $f$, $\alpha$, $a$ and $b$, but not on the variable of integration,
which may as well be omitted.

The role played by the variable of integration is quite analogous to that
of the index of summation: The two symbols

$\sum_{i=1}^{n} c_i, \qquad \sum_{k=1}^{n} c_k$

are the same, since each means $c_1 + c_2 + \cdots + c_n$.

Of course, no harm is done by inserting the variable of integration, and
in many cases it is actually convenient to do so.

We shall now investigate the existence of the integral (7). Without saying
so every time, $f$ will be assumed real and bounded, and $\alpha$ monotonically
increasing on $[a, b]$; and, when there can be no misunderstanding, we shall write $\int$ in
place of $\int_a^b$.

6.3 Definition We say that the partition $P^*$ is a refinement of $P$ if $P^* \supset P$
(that is, if every point of $P$ is a point of $P^*$). Given two partitions, $P_1$ and $P_2$,
we say that $P^*$ is their common refinement if $P^* = P_1 \cup P_2$.


6.4 Theorem If $P^*$ is a refinement of $P$, then

(9) $L(P, f, \alpha) \le L(P^*, f, \alpha)$

and

(10) $U(P^*, f, \alpha) \le U(P, f, \alpha).$

Proof To prove (9), suppose first that $P^*$ contains just one point more
than $P$. Let this extra point be $x^*$, and suppose $x_{i-1} < x^* < x_i$, where
$x_{i-1}$ and $x_i$ are two consecutive points of $P$. Put

$w_1 = \inf f(x) \qquad (x_{i-1} \le x \le x^*),$

$w_2 = \inf f(x) \qquad (x^* \le x \le x_i).$

Clearly $w_1 \ge m_i$ and $w_2 \ge m_i$, where, as before,

$m_i = \inf f(x) \qquad (x_{i-1} \le x \le x_i).$

Hence

$L(P^*, f, \alpha) - L(P, f, \alpha)$
$\quad = w_1[\alpha(x^*) - \alpha(x_{i-1})] + w_2[\alpha(x_i) - \alpha(x^*)] - m_i[\alpha(x_i) - \alpha(x_{i-1})]$
$\quad = (w_1 - m_i)[\alpha(x^*) - \alpha(x_{i-1})] + (w_2 - m_i)[\alpha(x_i) - \alpha(x^*)] \ge 0.$

If $P^*$ contains $k$ points more than $P$, we repeat this reasoning $k$
times, and arrive at (9). The proof of (10) is analogous.
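Inequalities (9) and (10) can be watched on a small example. The sketch below again restricts to a monotonically increasing $f$, so that the sup and inf on each subinterval sit at the endpoints; the choices $f(x) = x^3$, $\alpha(x) = x^2$ on $[0, 1]$, and the two partitions, are illustrative assumptions.

```python
from typing import Callable, List, Tuple


def stieltjes_sums_increasing(f: Callable[[float], float],
                              alpha: Callable[[float], float],
                              partition: List[float]) -> Tuple[float, float]:
    """U(P, f, alpha) and L(P, f, alpha) for increasing f and increasing alpha."""
    pairs = list(zip(partition, partition[1:]))
    upper = sum(f(x1) * (alpha(x1) - alpha(x0)) for x0, x1 in pairs)
    lower = sum(f(x0) * (alpha(x1) - alpha(x0)) for x0, x1 in pairs)
    return upper, lower


def f(x: float) -> float:
    return x**3


def alpha(x: float) -> float:
    return x**2  # increasing on [0, 1]


P = [0.0, 0.5, 1.0]
P_star = sorted(set(P) | {0.25, 0.75})  # a refinement of P

U_P, L_P = stieltjes_sums_increasing(f, alpha, P)
U_star, L_star = stieltjes_sums_increasing(f, alpha, P_star)
print(L_P, L_star, U_star, U_P)
```

The printed numbers should appear in increasing order: refining the partition raises the lower sum and lowers the upper sum.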
6.5 Theorem $\underline{\int_a^b} f \, d\alpha \le \overline{\int_a^b} f \, d\alpha.$

Proof Let $P^*$ be the common refinement of two partitions $P_1$ and $P_2$.
By Theorem 6.4,

$L(P_1, f, \alpha) \le L(P^*, f, \alpha) \le U(P^*, f, \alpha) \le U(P_2, f, \alpha).$

Hence

(11) $L(P_1, f, \alpha) \le U(P_2, f, \alpha).$

If $P_2$ is fixed and the sup is taken over all $P_1$, (11) gives

(12) $\underline{\int} f \, d\alpha \le U(P_2, f, \alpha).$

The theorem follows by taking the inf over all $P_2$ in (12).

6.6 Theorem $f \in \mathscr{R}(\alpha)$ on $[a, b]$ if and only if for every $\varepsilon > 0$ there exists a
partition $P$ such that

(13) $U(P, f, \alpha) - L(P, f, \alpha) < \varepsilon.$

Proof For every $P$ we have

$L(P, f, \alpha) \le \underline{\int} f \, d\alpha \le \overline{\int} f \, d\alpha \le U(P, f, \alpha).$

Thus (13) implies

$0 \le \overline{\int} f \, d\alpha - \underline{\int} f \, d\alpha < \varepsilon.$

Hence, if (13) can be satisfied for every $\varepsilon > 0$, we have

$\overline{\int} f \, d\alpha = \underline{\int} f \, d\alpha,$

that is, $f \in \mathscr{R}(\alpha)$.

Conversely, suppose $f \in \mathscr{R}(\alpha)$, and let $\varepsilon > 0$ be given. Then there
exist partitions $P_1$ and $P_2$ such that

(14) $U(P_2, f, \alpha) - \int f \, d\alpha < \frac{\varepsilon}{2},$

(15) $\int f \, d\alpha - L(P_1, f, \alpha) < \frac{\varepsilon}{2}.$

We choose $P$ to be the common refinement of $P_1$ and $P_2$. Then Theorem
6.4, together with (14) and (15), shows that

$U(P, f, \alpha) \le U(P_2, f, \alpha) < \int f \, d\alpha + \frac{\varepsilon}{2} < L(P_1, f, \alpha) + \varepsilon \le L(P, f, \alpha) + \varepsilon,$

so that (13) holds for this partition $P$.

Theorem 6.6 furnishes a convenient criterion for integrability. Before we
apply it, we state some closely related facts.

6.7 Theorem

(a) If (13) holds for some $P$ and some $\varepsilon$, then (13) holds (with the same $\varepsilon$)
for every refinement of $P$.

(b) If (13) holds for $P = \{x_0, \dots, x_n\}$ and if $s_i$, $t_i$ are arbitrary points in
$[x_{i-1}, x_i]$, then

$\sum_{i=1}^{n} |f(s_i) - f(t_i)| \, \Delta \alpha_i < \varepsilon.$

(c) If $f \in \mathscr{R}(\alpha)$ and the hypotheses of (b) hold, then

$\left| \sum_{i=1}^{n} f(t_i) \, \Delta \alpha_i - \int_a^b f \, d\alpha \right| < \varepsilon.$

Proof Theorem 6.4 implies (a). Under the assumptions made in (b),
both $f(s_i)$ and $f(t_i)$ lie in $[m_i, M_i]$, so that $|f(s_i) - f(t_i)| \le M_i - m_i$. Thus

$\sum_{i=1}^{n} |f(s_i) - f(t_i)| \, \Delta \alpha_i \le U(P, f, \alpha) - L(P, f, \alpha),$

which proves (b). The obvious inequalities

$L(P, f, \alpha) \le \sum f(t_i) \, \Delta \alpha_i \le U(P, f, \alpha)$

and

$L(P, f, \alpha) \le \int f \, d\alpha \le U(P, f, \alpha)$

prove (c).

6.8 Theorem If $f$ is continuous on $[a, b]$ then $f \in \mathscr{R}(\alpha)$ on $[a, b]$.

Proof Let $\varepsilon > 0$ be given. Choose $\eta > 0$ so that

$[\alpha(b) - \alpha(a)] \eta < \varepsilon.$

Since $f$ is uniformly continuous on $[a, b]$ (Theorem 4.19), there exists a
$\delta > 0$ such that

(16) $|f(x) - f(t)| < \eta$

if $x \in [a, b]$, $t \in [a, b]$, and $|x - t| < \delta$.

If $P$ is any partition of $[a, b]$ such that $\Delta x_i < \delta$ for all $i$, then (16)
implies that

(17) $M_i - m_i \le \eta \qquad (i = 1, \dots, n)$

and therefore

$U(P, f, \alpha) - L(P, f, \alpha) = \sum_{i=1}^{n} (M_i - m_i) \, \Delta \alpha_i \le \eta \sum_{i=1}^{n} \Delta \alpha_i = \eta [\alpha(b) - \alpha(a)] < \varepsilon.$

By Theorem 6.6, $f \in \mathscr{R}(\alpha)$.


6.9 Theorem If $f$ is monotonic on $[a, b]$, and if $\alpha$ is continuous on $[a, b]$, then
$f \in \mathscr{R}(\alpha)$. (We still assume, of course, that $\alpha$ is monotonic.)

Proof Let $\varepsilon > 0$ be given. For any positive integer $n$, choose a partition
such that

$\Delta \alpha_i = \frac{\alpha(b) - \alpha(a)}{n} \qquad (i = 1, \dots, n).$

This is possible since $\alpha$ is continuous (Theorem 4.23).

We suppose that $f$ is monotonically increasing (the proof is analogous
in the other case). Then

$M_i = f(x_i), \qquad m_i = f(x_{i-1}) \qquad (i = 1, \dots, n),$

so that

$U(P, f, \alpha) - L(P, f, \alpha) = \frac{\alpha(b) - \alpha(a)}{n} \sum_{i=1}^{n} [f(x_i) - f(x_{i-1})]$
$\quad = \frac{\alpha(b) - \alpha(a)}{n} \cdot [f(b) - f(a)] < \varepsilon$

if $n$ is taken large enough. By Theorem 6.6, $f \in \mathscr{R}(\alpha)$.


6.10 Theorem Suppose $f$ is bounded on $[a, b]$, $f$ has only finitely many points
of discontinuity on $[a, b]$, and $\alpha$ is continuous at every point at which $f$ is
discontinuous. Then $f \in \mathscr{R}(\alpha)$.

Proof Let $\varepsilon > 0$ be given. Put $M = \sup |f(x)|$, let $E$ be the set of points
at which $f$ is discontinuous. Since $E$ is finite and $\alpha$ is continuous at every
point of $E$, we can cover $E$ by finitely many disjoint intervals $[u_j, v_j] \subset
[a, b]$ such that the sum of the corresponding differences $\alpha(v_j) - \alpha(u_j)$ is
less than $\varepsilon$. Furthermore, we can place these intervals in such a way that
every point of $E \cap (a, b)$ lies in the interior of some $[u_j, v_j]$.

Remove the segments $(u_j, v_j)$ from $[a, b]$. The remaining set $K$ is
compact. Hence $f$ is uniformly continuous on $K$, and there exists $\delta > 0$
such that $|f(s) - f(t)| < \varepsilon$ if $s \in K$, $t \in K$, $|s - t| < \delta$.

Now form a partition $P = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$, as follows:
Each $u_j$ occurs in $P$. Each $v_j$ occurs in $P$. No point of any segment $(u_j, v_j)$
occurs in $P$. If $x_{i-1}$ is not one of the $u_j$, then $\Delta x_i < \delta$.

Note that $M_i - m_i \le 2M$ for every $i$, and that $M_i - m_i \le \varepsilon$ unless
$x_{i-1}$ is one of the $u_j$. Hence, as in the proof of Theorem 6.8,

$U(P, f, \alpha) - L(P, f, \alpha) \le [\alpha(b) - \alpha(a)] \varepsilon + 2M\varepsilon.$

Since $\varepsilon$ is arbitrary, Theorem 6.6 shows that $f \in \mathscr{R}(\alpha)$.

Note: If $f$ and $\alpha$ have a common point of discontinuity, then $f$ need not
be in $\mathscr{R}(\alpha)$. Exercise 3 shows this.

6.11 Theorem Suppose $f \in \mathscr{R}(\alpha)$ on $[a, b]$, $m \le f \le M$, $\phi$ is continuous on
$[m, M]$, and $h(x) = \phi(f(x))$ on $[a, b]$. Then $h \in \mathscr{R}(\alpha)$ on $[a, b]$.

Proof Choose $\varepsilon > 0$. Since $\phi$ is uniformly continuous on $[m, M]$, there
exists $\delta > 0$ such that $\delta < \varepsilon$ and $|\phi(s) - \phi(t)| < \varepsilon$ if $|s - t| \le \delta$ and
$s, t \in [m, M]$.

Since $f \in \mathscr{R}(\alpha)$, there is a partition $P = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$ such
that

(18) $U(P, f, \alpha) - L(P, f, \alpha) < \delta^2.$

Let $M_i$, $m_i$ have the same meaning as in Definition 6.1, and let $M_i^*$, $m_i^*$
be the analogous numbers for $h$. Divide the numbers $1, \dots, n$ into two
classes: $i \in A$ if $M_i - m_i < \delta$, $i \in B$ if $M_i - m_i \ge \delta$.

For $i \in A$, our choice of $\delta$ shows that $M_i^* - m_i^* \le \varepsilon$.

For $i \in B$, $M_i^* - m_i^* \le 2K$, where $K = \sup |\phi(t)|$, $m \le t \le M$. By
(18), we have

(19) $\delta \sum_{i \in B} \Delta \alpha_i \le \sum_{i \in B} (M_i - m_i) \, \Delta \alpha_i < \delta^2$

so that $\sum_{i \in B} \Delta \alpha_i < \delta$. It follows that

$U(P, h, \alpha) - L(P, h, \alpha) = \sum_{i \in A} (M_i^* - m_i^*) \, \Delta \alpha_i + \sum_{i \in B} (M_i^* - m_i^*) \, \Delta \alpha_i$
$\quad \le \varepsilon [\alpha(b) - \alpha(a)] + 2K\delta < \varepsilon [\alpha(b) - \alpha(a) + 2K].$

Since $\varepsilon$ was arbitrary, Theorem 6.6 implies that $h \in \mathscr{R}(\alpha)$.

Remark: This theorem suggests the question: Just what functions are
Riemann-integrable? The answer is given by Theorem 11.33(b).
PROPERTIES OF THE INTEGRAL


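6.12 Theorem

(a) If $f_1 \in \mathscr{R}(\alpha)$ and $f_2 \in \mathscr{R}(\alpha)$ on $[a, b]$, then

$f_1 + f_2 \in \mathscr{R}(\alpha),$

$cf \in \mathscr{R}(\alpha)$ for every constant $c$, and

$\int_a^b (f_1 + f_2) \, d\alpha = \int_a^b f_1 \, d\alpha + \int_a^b f_2 \, d\alpha,$

$\int_a^b cf \, d\alpha = c \int_a^b f \, d\alpha.$

(b) If $f_1(x) \le f_2(x)$ on $[a, b]$, then

$\int_a^b f_1 \, d\alpha \le \int_a^b f_2 \, d\alpha.$

(c) If $f \in \mathscr{R}(\alpha)$ on $[a, b]$ and if $a < c < b$, then $f \in \mathscr{R}(\alpha)$ on $[a, c]$ and on
$[c, b]$, and

$\int_a^c f \, d\alpha + \int_c^b f \, d\alpha = \int_a^b f \, d\alpha.$

(d) If $f \in \mathscr{R}(\alpha)$ on $[a, b]$ and if $|f(x)| \le M$ on $[a, b]$, then

$\left| \int_a^b f \, d\alpha \right| \le M [\alpha(b) - \alpha(a)].$

(e) If $f \in \mathscr{R}(\alpha_1)$ and $f \in \mathscr{R}(\alpha_2)$, then $f \in \mathscr{R}(\alpha_1 + \alpha_2)$ and

$\int_a^b f \, d(\alpha_1 + \alpha_2) = \int_a^b f \, d\alpha_1 + \int_a^b f \, d\alpha_2;$

if $f \in \mathscr{R}(\alpha)$ and $c$ is a positive constant, then $f \in \mathscr{R}(c\alpha)$ and

$\int_a^b f \, d(c\alpha) = c \int_a^b f \, d\alpha.$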


Proof If $f = f_1 + f_2$ and $P$ is any partition of $[a, b]$, we have

(20) $L(P, f_1, \alpha) + L(P, f_2, \alpha) \le L(P, f, \alpha) \le U(P, f, \alpha) \le U(P, f_1, \alpha) + U(P, f_2, \alpha).$

If $f_1 \in \mathscr{R}(\alpha)$ and $f_2 \in \mathscr{R}(\alpha)$, let $\varepsilon > 0$ be given. There are partitions $P_j$
$(j = 1, 2)$ such that

$U(P_j, f_j, \alpha) - L(P_j, f_j, \alpha) < \varepsilon.$

These inequalities persist if $P_1$ and $P_2$ are replaced by their common
refinement $P$. Then (20) implies

$U(P, f, \alpha) - L(P, f, \alpha) < 2\varepsilon,$

which proves that $f \in \mathscr{R}(\alpha)$.

With this same $P$ we have

$U(P, f_j, \alpha) < \int f_j \, d\alpha + \varepsilon \qquad (j = 1, 2);$

hence (20) implies

$\int f \, d\alpha \le U(P, f, \alpha) < \int f_1 \, d\alpha + \int f_2 \, d\alpha + 2\varepsilon.$

Since $\varepsilon$ was arbitrary, we conclude that

(21) $\int f \, d\alpha \le \int f_1 \, d\alpha + \int f_2 \, d\alpha.$

If we replace $f_1$ and $f_2$ in (21) by $-f_1$ and $-f_2$, the inequality is
reversed, and the equality is proved.

The proofs of the other assertions of Theorem 6.12 are so similar
that we omit the details. In part (c) the point is that (by passing to
refinements) we may restrict ourselves to partitions which contain the point $c$,
in approximating $\int f \, d\alpha$.


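6.13 Theorem If $f \in \mathscr{R}(\alpha)$ and $g \in \mathscr{R}(\alpha)$ on $[a, b]$, then

(a) $fg \in \mathscr{R}(\alpha)$;

(b) $|f| \in \mathscr{R}(\alpha)$ and $\left| \int_a^b f \, d\alpha \right| \le \int_a^b |f| \, d\alpha.$

Proof If we take $\phi(t) = t^2$, Theorem 6.11 shows that $f^2 \in \mathscr{R}(\alpha)$ if $f \in \mathscr{R}(\alpha)$.
The identity

$4fg = (f + g)^2 - (f - g)^2$

completes the proof of (a).

If we take $\phi(t) = |t|$, Theorem 6.11 shows similarly that $|f| \in \mathscr{R}(\alpha)$.
Choose $c = \pm 1$, so that

$c \int f \, d\alpha \ge 0.$

Then

$\left| \int f \, d\alpha \right| = c \int f \, d\alpha = \int cf \, d\alpha \le \int |f| \, d\alpha,$

since $cf \le |f|$.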


6.14 Definition The unit step function $I$ is defined by

$I(x) = \begin{cases} 0 & (x \le 0), \\ 1 & (x > 0). \end{cases}$
6.15 Theorem If $a < s < b$, $f$ is bounded on $[a, b]$, $f$ is continuous at $s$, and
$\alpha(x) = I(x - s)$, then

$\int_a^b f \, d\alpha = f(s).$

Proof Consider partitions $P = \{x_0, x_1, x_2, x_3\}$, where $x_0 = a$, and
$x_1 = s < x_2 < x_3 = b$. Then

$U(P, f, \alpha) = M_2, \qquad L(P, f, \alpha) = m_2.$

Since $f$ is continuous at $s$, we see that $M_2$ and $m_2$ converge to $f(s)$ as
$x_2 \to s$.


6.16 Theorem Suppose $c_n \ge 0$ for $n = 1, 2, 3, \dots$, $\sum c_n$ converges, $\{s_n\}$ is a sequence
of distinct points in $(a, b)$, and

(22) $\alpha(x) = \sum_{n=1}^{\infty} c_n I(x - s_n).$

Let $f$ be continuous on $[a, b]$. Then

(23) $\int_a^b f \, d\alpha = \sum_{n=1}^{\infty} c_n f(s_n).$

Proof The comparison test shows that the series (22) converges for
every $x$. Its sum $\alpha(x)$ is evidently monotonic, and $\alpha(a) = 0$, $\alpha(b) = \sum c_n$.
(This is the type of function that occurred in Remark 4.31.)

Let $\varepsilon > 0$ be given, and choose $N$ so that

$\sum_{n=N+1}^{\infty} c_n < \varepsilon.$

Put

$\alpha_1(x) = \sum_{n=1}^{N} c_n I(x - s_n), \qquad \alpha_2(x) = \sum_{n=N+1}^{\infty} c_n I(x - s_n).$

By Theorems 6.12 and 6.15,

(24) $\int_a^b f \, d\alpha_1 = \sum_{n=1}^{N} c_n f(s_n).$

Since $\alpha_2(b) - \alpha_2(a) < \varepsilon$,

(25) $\left| \int_a^b f \, d\alpha_2 \right| \le M\varepsilon,$

where $M = \sup |f(x)|$. Since $\alpha = \alpha_1 + \alpha_2$, it follows from (24) and (25)
that

(26) $\left| \int_a^b f \, d\alpha - \sum_{n=1}^{N} c_n f(s_n) \right| \le M\varepsilon.$

If we let $N \to \infty$, we obtain (23).
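Formula (23) can be illustrated with a finite step function, which is the situation of (24). The sketch below assumes the illustrative data $c_n = 2^{-n}$, $s_n = 1/(n + 1)$ for $n = 1, \dots, 20$, $f = \cos$ on $[0, 1]$, and approximates the integral by a sum $\sum f(t_i) \Delta \alpha_i$ with right-endpoint tags on a uniform partition; the helper names are hypothetical.

```python
import math
from typing import Callable


def unit_step(x: float) -> float:
    """The unit step function I of Definition 6.14."""
    return 1.0 if x > 0 else 0.0


N = 20
c = [2.0 ** -n for n in range(1, N + 1)]     # weights c_n >= 0
s = [1.0 / (n + 1) for n in range(1, N + 1)]  # distinct points in (0, 1)


def alpha(x: float) -> float:
    """alpha(x) = sum of c_n * I(x - s_n), a finite version of (22)."""
    return sum(cn * unit_step(x - sn) for cn, sn in zip(c, s))


def stieltjes_sum(f: Callable[[float], float],
                  alpha: Callable[[float], float],
                  a: float, b: float, m: int) -> float:
    """Sum of f(x_i) * (alpha(x_i) - alpha(x_{i-1})) on a uniform partition."""
    total = 0.0
    previous = alpha(a)
    for i in range(1, m + 1):
        x = a + (b - a) * i / m
        current = alpha(x)
        total += f(x) * (current - previous)
        previous = current
    return total


approx = stieltjes_sum(math.cos, alpha, 0.0, 1.0, 2000)
series = sum(cn * math.cos(sn) for cn, sn in zip(c, s))
print(approx, series)
```

Only the subintervals that contain some $s_n$ contribute to the sum, which is why the integral collapses to the series $\sum c_n f(s_n)$.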


6.17 Theorem Assume $\alpha$ increases monotonically and $\alpha' \in \mathscr{R}$ on $[a, b]$. Let $f$
be a bounded real function on $[a, b]$.

Then $f \in \mathscr{R}(\alpha)$ if and only if $f\alpha' \in \mathscr{R}$. In that case

(27) $\int_a^b f \, d\alpha = \int_a^b f(x) \alpha'(x) \, dx.$

Proof Let $\varepsilon > 0$ be given and apply Theorem 6.6 to $\alpha'$: There is a
partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$ such that

(28) $U(P, \alpha') - L(P, \alpha') < \varepsilon.$

The mean value theorem furnishes points $t_i \in [x_{i-1}, x_i]$ such that

$\Delta \alpha_i = \alpha'(t_i) \, \Delta x_i$

for $i = 1, \dots, n$. If $s_i \in [x_{i-1}, x_i]$, then

(29) $\sum_{i=1}^{n} |\alpha'(s_i) - \alpha'(t_i)| \, \Delta x_i < \varepsilon,$

by (28) and Theorem 6.7(b). Put $M = \sup |f(x)|$. Since

$\sum_{i=1}^{n} f(s_i) \, \Delta \alpha_i = \sum_{i=1}^{n} f(s_i) \alpha'(t_i) \, \Delta x_i$

it follows from (29) that

(30) $\left| \sum_{i=1}^{n} f(s_i) \, \Delta \alpha_i - \sum_{i=1}^{n} f(s_i) \alpha'(s_i) \, \Delta x_i \right| \le M\varepsilon.$

In particular,

$\sum_{i=1}^{n} f(s_i) \, \Delta \alpha_i \le U(P, f\alpha') + M\varepsilon,$

for all choices of $s_i \in [x_{i-1}, x_i]$, so that

$U(P, f, \alpha) \le U(P, f\alpha') + M\varepsilon.$

The same argument leads from (30) to

$U(P, f\alpha') \le U(P, f, \alpha) + M\varepsilon.$

Thus

(31) $|U(P, f, \alpha) - U(P, f\alpha')| \le M\varepsilon.$

Now note that (28) remains true if $P$ is replaced by any refinement.
Hence (31) also remains true. We conclude that

$\left| \overline{\int_a^b} f \, d\alpha - \overline{\int_a^b} f(x) \alpha'(x) \, dx \right| \le M\varepsilon.$

But $\varepsilon$ is arbitrary. Hence

(32) $\overline{\int_a^b} f \, d\alpha = \overline{\int_a^b} f(x) \alpha'(x) \, dx,$

for any bounded $f$. The equality of the lower integrals follows from (30)
in exactly the same way. The theorem follows.
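Formula (27) can be checked on a simple case. The sketch below assumes the illustrative choices $f(x) = x$ and $\alpha(x) = x^2$ on $[0, 1]$, for which both sides of (27) equal $\int_0^1 2x^2 \, dx = 2/3$; it compares a Stieltjes sum $\sum f(x_i) \Delta \alpha_i$ with a Riemann sum $\sum f(x_i) \alpha'(x_i) \Delta x_i$, both with right-endpoint tags on a uniform partition.

```python
from typing import Callable


def stieltjes_sum(f: Callable[[float], float],
                  alpha: Callable[[float], float],
                  a: float, b: float, m: int) -> float:
    """Sum of f(x_i) * (alpha(x_i) - alpha(x_{i-1})), uniform partition."""
    total = 0.0
    for i in range(1, m + 1):
        x_prev = a + (b - a) * (i - 1) / m
        x = a + (b - a) * i / m
        total += f(x) * (alpha(x) - alpha(x_prev))
    return total


def riemann_sum(h: Callable[[float], float], a: float, b: float, m: int) -> float:
    """Sum of h(x_i) * (x_i - x_{i-1}), uniform partition, right-endpoint tags."""
    width = (b - a) / m
    return sum(h(a + width * i) * width for i in range(1, m + 1))


def f(x: float) -> float:
    return x


def alpha(x: float) -> float:
    return x * x


def alpha_prime(x: float) -> float:
    return 2 * x


m = 2000  # number of subintervals, an arbitrary illustrative value
left_side = stieltjes_sum(f, alpha, 0.0, 1.0, m)
right_side = riemann_sum(lambda x: f(x) * alpha_prime(x), 0.0, 1.0, m)
print(left_side, right_side)
```

The two sums differ only because $\Delta \alpha_i = \alpha'(t_i) \Delta x_i$ uses a mean-value point $t_i$ rather than the tag itself, which is the gap that (29) and (30) control.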


6.18 Remark The two preceding theorems illustrate the generality and 
flexibility which are inherent in the Stieltjes process of integration. If a is a pure 
step function [this is the name often given to functions of the form (22)], the 
integral reduces to a finite or infinite series. If a has an integrable derivative, 
the integral reduces to an ordinary Riemann integral. This makes it possible 
in many cases to study series and integrals simultaneously, rather than separately. 

To illustrate this point, consider a physical example. The moment of
inertia of a straight wire of unit length, about an axis through an endpoint, at
right angles to the wire, is

(33)    $\int_0^1 x^2 \, dm$

where $m(x)$ is the mass contained in the interval $[0, x]$. If the wire is regarded
as having a continuous density $\rho$, that is, if $m'(x) = \rho(x)$, then (33) turns into

(34)    $\int_0^1 x^2 \rho(x) \, dx.$

On the other hand, if the wire is composed of masses $m_i$ concentrated at
points $x_i$, (33) becomes

(35)    $\sum_i x_i^2 m_i.$

Thus (33) contains (34) and (35) as special cases, but it contains much
more; for instance, the case in which $m$ is continuous but not everywhere
differentiable.
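
The remark can be checked numerically. The following is only a sketch under stated assumptions: the helper `stieltjes_sum`, the density $\rho(x) = 2x$ (so that $m(x) = x^2$), and the two point masses of size 0.5 at 0.25 and 0.75 are illustrative choices, not taken from the text. The same Riemann-Stieltjes sum approximates (34) in the first case and (35) in the second.

```python
def stieltjes_sum(f, alpha, a, b, n):
    """Riemann-Stieltjes sum of f with respect to alpha on a uniform
    partition of [a, b] into n pieces, using right endpoints as tags."""
    total = 0.0
    for i in range(1, n + 1):
        x_prev = a + (b - a) * (i - 1) / n
        x_i = a + (b - a) * i / n
        total += f(x_i) * (alpha(x_i) - alpha(x_prev))
    return total


def square(x):
    return x * x


def m_continuous(x):
    # Hypothetical wire with density rho(x) = 2x, hence m(x) = x^2.
    return x * x


def m_discrete(x):
    # Hypothetical wire made of two point masses 0.5 at x = 0.25 and x = 0.75.
    return (0.5 if x >= 0.25 else 0.0) + (0.5 if x >= 0.75 else 0.0)


# (34): integral of x^2 * 2x over [0, 1] equals 1/2.
continuous_moment = stieltjes_sum(square, m_continuous, 0.0, 1.0, 10000)
# (35): 0.5 * 0.25^2 + 0.5 * 0.75^2 equals 0.3125.
discrete_moment = stieltjes_sum(square, m_discrete, 0.0, 1.0, 10000)
print(continuous_moment, discrete_moment)
```

One routine handles both the "integral" case and the "series" case, which is the point of the remark.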

6.19 Theorem (change of variable) Suppose $\varphi$ is a strictly increasing continuous
function that maps an interval $[A, B]$ onto $[a, b]$. Suppose $\alpha$ is monotonically
increasing on $[a, b]$ and $f \in \mathscr{R}(\alpha)$ on $[a, b]$. Define $\beta$ and $g$ on $[A, B]$ by

(36)    $\beta(y) = \alpha(\varphi(y)), \qquad g(y) = f(\varphi(y)).$

Then $g \in \mathscr{R}(\beta)$ and

(37)    $\int_A^B g \, d\beta = \int_a^b f \, d\alpha.$

Proof To each partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$ corresponds a partition
$Q = \{y_0, \dots, y_n\}$ of $[A, B]$, so that $x_i = \varphi(y_i)$. All partitions of $[A, B]$
are obtained in this way. Since the values taken by $f$ on $[x_{i-1}, x_i]$ are
exactly the same as those taken by $g$ on $[y_{i-1}, y_i]$, we see that

(38)    $U(Q, g, \beta) = U(P, f, \alpha), \qquad L(Q, g, \beta) = L(P, f, \alpha).$

Since $f \in \mathscr{R}(\alpha)$, $P$ can be chosen so that both $U(P, f, \alpha)$ and $L(P, f, \alpha)$
are close to $\int f \, d\alpha$. Hence (38), combined with Theorem 6.6, shows that
$g \in \mathscr{R}(\beta)$ and that (37) holds. This completes the proof.

Let us note the following special case:

Take $\alpha(x) = x$. Then $\beta = \varphi$. Assume $\varphi' \in \mathscr{R}$ on $[A, B]$. If Theorem
6.17 is applied to the left side of (37), we obtain

(39)    $\int_a^b f(x) \, dx = \int_A^B f(\varphi(y))\varphi'(y) \, dy.$

INTEGRATION AND DIFFERENTIATION 

We still confine ourselves to real functions in this section. We shall show that 
integration and differentiation are, in a certain sense, inverse operations. 

6.20 Theorem Let $f \in \mathscr{R}$ on $[a, b]$. For $a \le x \le b$, put

    $F(x) = \int_a^x f(t) \, dt.$

Then $F$ is continuous on $[a, b]$; furthermore, if $f$ is continuous at a point $x_0$ of
$[a, b]$, then $F$ is differentiable at $x_0$, and

    $F'(x_0) = f(x_0).$

Proof Since $f \in \mathscr{R}$, $f$ is bounded. Suppose $|f(t)| \le M$ for $a \le t \le b$.
If $a \le x < y \le b$, then

    $|F(y) - F(x)| = \left| \int_x^y f(t) \, dt \right| \le M(y - x),$

by Theorem 6.12(c) and (d). Given $\varepsilon > 0$, we see that

    $|F(y) - F(x)| < \varepsilon,$

provided that $|y - x| < \varepsilon/M$. This proves continuity (and, in fact,
uniform continuity) of $F$.

Now suppose $f$ is continuous at $x_0$. Given $\varepsilon > 0$, choose $\delta > 0$ such
that

    $|f(t) - f(x_0)| < \varepsilon$

if $|t - x_0| < \delta$, and $a \le t \le b$. Hence, if

    $x_0 - \delta < s \le x_0 \le t < x_0 + \delta$ and $a \le s < t \le b$,

we have, by Theorem 6.12(d),

    $\left| \frac{F(t) - F(s)}{t - s} - f(x_0) \right| = \left| \frac{1}{t - s} \int_s^t [f(u) - f(x_0)] \, du \right| < \varepsilon.$

It follows that $F'(x_0) = f(x_0)$.

6.21 The fundamental theorem of calculus If $f \in \mathscr{R}$ on $[a, b]$ and if there is
a differentiable function $F$ on $[a, b]$ such that $F' = f$, then

    $\int_a^b f(x) \, dx = F(b) - F(a).$

Proof Let $\varepsilon > 0$ be given. Choose a partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$
so that $U(P, f) - L(P, f) < \varepsilon$. The mean value theorem furnishes points
$t_i \in [x_{i-1}, x_i]$ such that

    $F(x_i) - F(x_{i-1}) = f(t_i) \, \Delta x_i$

for $i = 1, \dots, n$. Thus

    $\sum_{i=1}^{n} f(t_i) \, \Delta x_i = F(b) - F(a).$

It now follows from Theorem 6.7(c) that

    $\left| F(b) - F(a) - \int_a^b f(x) \, dx \right| < \varepsilon.$

Since this holds for every e > 0, the proof is complete. 

6.22 Theorem (integration by parts) Suppose $F$ and $G$ are differentiable functions
on $[a, b]$, $F' = f \in \mathscr{R}$, and $G' = g \in \mathscr{R}$. Then

    $\int_a^b F(x)g(x) \, dx = F(b)G(b) - F(a)G(a) - \int_a^b f(x)G(x) \, dx.$

Proof Put $H(x) = F(x)G(x)$ and apply Theorem 6.21 to $H$ and its derivative.
Note that $H' \in \mathscr{R}$, by Theorem 6.13.


INTEGRATION OF VECTOR-VALUED FUNCTIONS

6.23 Definition Let $f_1, \dots, f_k$ be real functions on $[a, b]$, and let $\mathbf{f} = (f_1, \dots, f_k)$
be the corresponding mapping of $[a, b]$ into $R^k$. If $\alpha$ increases monotonically
on $[a, b]$, to say that $\mathbf{f} \in \mathscr{R}(\alpha)$ means that $f_j \in \mathscr{R}(\alpha)$ for $j = 1, \dots, k$. If this is the
case, we define

    $\int_a^b \mathbf{f} \, d\alpha = \left( \int_a^b f_1 \, d\alpha, \dots, \int_a^b f_k \, d\alpha \right).$

In other words, $\int \mathbf{f} \, d\alpha$ is the point in $R^k$ whose $j$th coordinate is $\int f_j \, d\alpha$.

It is clear that parts (a), (c), and (e) of Theorem 6.12 are valid for these
vector-valued integrals; we simply apply the earlier results to each coordinate.
The same is true of Theorems 6.17, 6.20, and 6.21. To illustrate, we state the
analogue of Theorem 6.21.


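6.24 Theorem If $\mathbf{f}$ and $\mathbf{F}$ map $[a, b]$ into $R^k$, if $\mathbf{f} \in \mathscr{R}$ on $[a, b]$, and if $\mathbf{F}' = \mathbf{f}$, then

    $\int_a^b \mathbf{f}(t) \, dt = \mathbf{F}(b) - \mathbf{F}(a).$

The analogue of Theorem 6.13(b) offers some new features, however, at
least in its proof.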


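6.25 Theorem If $\mathbf{f}$ maps $[a, b]$ into $R^k$ and if $\mathbf{f} \in \mathscr{R}(\alpha)$ for some monotonically
increasing function $\alpha$ on $[a, b]$, then $|\mathbf{f}| \in \mathscr{R}(\alpha)$, and

(40)    $\left| \int_a^b \mathbf{f} \, d\alpha \right| \le \int_a^b |\mathbf{f}| \, d\alpha.$

Proof If $f_1, \dots, f_k$ are the components of $\mathbf{f}$, then

(41)    $|\mathbf{f}| = (f_1^2 + \cdots + f_k^2)^{1/2}.$

By Theorem 6.11, each of the functions $f_i^2$ belongs to $\mathscr{R}(\alpha)$; hence so does
their sum. Since $x^2$ is a continuous function of $x$, Theorem 4.17 shows
that the square-root function is continuous on $[0, M]$, for every real $M$.
If we apply Theorem 6.11 once more, (41) shows that $|\mathbf{f}| \in \mathscr{R}(\alpha)$.

To prove (40), put $\mathbf{y} = (y_1, \dots, y_k)$, where $y_j = \int f_j \, d\alpha$. Then we have
$\mathbf{y} = \int \mathbf{f} \, d\alpha$, and

    $|\mathbf{y}|^2 = \sum y_j^2 = \sum y_j \int f_j \, d\alpha = \int \left( \sum y_j f_j \right) d\alpha.$

By the Schwarz inequality,

(42)    $\sum y_j f_j(t) \le |\mathbf{y}| \, |\mathbf{f}(t)| \qquad (a \le t \le b);$

hence Theorem 6.12(b) implies

(43)    $|\mathbf{y}|^2 \le |\mathbf{y}| \int |\mathbf{f}| \, d\alpha.$

If $\mathbf{y} = \mathbf{0}$, (40) is trivial. If $\mathbf{y} \ne \mathbf{0}$, division of (43) by $|\mathbf{y}|$ gives (40).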


RECTIFIABLE CURVES 

We conclude this chapter with a topic of geometric interest which provides an 
application of some of the preceding theory. The case k = 2 (i.e., the case of 
plane curves) is of considerable importance in the study of analytic functions 
of a complex variable. 

6.26 Definition A continuous mapping $\gamma$ of an interval $[a, b]$ into $R^k$ is called
a curve in $R^k$. To emphasize the parameter interval $[a, b]$, we may also say that
$\gamma$ is a curve on $[a, b]$.

If $\gamma$ is one-to-one, $\gamma$ is called an arc.

If $\gamma(a) = \gamma(b)$, $\gamma$ is said to be a closed curve.

It should be noted that we define a curve to be a mapping, not a point set.
Of course, with each curve $\gamma$ in $R^k$ there is associated a subset of $R^k$, namely
the range of $\gamma$, but different curves may have the same range.

We associate to each partition $P = \{x_0, \dots, x_n\}$ of $[a, b]$ and to each
curve $\gamma$ on $[a, b]$ the number

    $\Lambda(P, \gamma) = \sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})|.$

The $i$th term in this sum is the distance (in $R^k$) between the points $\gamma(x_{i-1})$ and
$\gamma(x_i)$. Hence $\Lambda(P, \gamma)$ is the length of a polygonal path with vertices at $\gamma(x_0)$,
$\gamma(x_1), \dots, \gamma(x_n)$, in this order. As our partition becomes finer and finer, this
polygon approaches the range of $\gamma$ more and more closely. This makes it seem
reasonable to define the length of $\gamma$ as

    $\Lambda(\gamma) = \sup \Lambda(P, \gamma),$

where the supremum is taken over all partitions of $[a, b]$.

If $\Lambda(\gamma) < \infty$, we say that $\gamma$ is rectifiable.
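
As an illustration of this definition, the sketch below computes the polygonal lengths $\Lambda(P, \gamma)$ for the unit circle $\gamma(t) = (\cos t, \sin t)$ on $[0, 2\pi]$, using uniform partitions into 4, 16, 64, and 256 pieces. The curve, the helper names, and the choice of partitions are assumptions made for the example. Each partition refines the previous one, so the lengths increase, and they stay below $2\pi$.

```python
import math


def polygonal_length(gamma, a, b, n):
    """Length of the polygon with vertices gamma(x_0), ..., gamma(x_n)
    for the uniform partition of [a, b] into n pieces."""
    points = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def unit_circle(t):
    # Hypothetical example curve: the unit circle traversed once.
    return (math.cos(t), math.sin(t))


lengths = [polygonal_length(unit_circle, 0.0, 2 * math.pi, n)
           for n in (4, 16, 64, 256)]
for n, length in zip((4, 16, 64, 256), lengths):
    print(n, length)
```

The printed values approach $2\pi$ from below, consistent with taking the supremum over all partitions.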

In certain cases, $\Lambda(\gamma)$ is given by a Riemann integral. We shall prove this
for continuously differentiable curves, i.e., for curves $\gamma$ whose derivative $\gamma'$ is
continuous.

6.27 Theorem If $\gamma'$ is continuous on $[a, b]$, then $\gamma$ is rectifiable, and

    $\Lambda(\gamma) = \int_a^b |\gamma'(t)| \, dt.$

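Proof If $a \le x_{i-1} < x_i \le b$, then

    $|\gamma(x_i) - \gamma(x_{i-1})| = \left| \int_{x_{i-1}}^{x_i} \gamma'(t) \, dt \right| \le \int_{x_{i-1}}^{x_i} |\gamma'(t)| \, dt.$

Hence

    $\Lambda(P, \gamma) \le \int_a^b |\gamma'(t)| \, dt$

for every partition $P$ of $[a, b]$. Consequently,

    $\Lambda(\gamma) \le \int_a^b |\gamma'(t)| \, dt.$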

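To prove the opposite inequality, let $\varepsilon > 0$ be given. Since $\gamma'$ is
uniformly continuous on $[a, b]$, there exists $\delta > 0$ such that

    $|\gamma'(s) - \gamma'(t)| < \varepsilon$ if $|s - t| < \delta$.

Let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$, with $\Delta x_i < \delta$ for all $i$. If
$x_{i-1} \le t \le x_i$, it follows that

    $|\gamma'(t)| \le |\gamma'(x_i)| + \varepsilon.$

Hence

    $\int_{x_{i-1}}^{x_i} |\gamma'(t)| \, dt \le |\gamma'(x_i)| \, \Delta x_i + \varepsilon \, \Delta x_i$

    $= \left| \int_{x_{i-1}}^{x_i} [\gamma'(t) + \gamma'(x_i) - \gamma'(t)] \, dt \right| + \varepsilon \, \Delta x_i$

    $\le \left| \int_{x_{i-1}}^{x_i} \gamma'(t) \, dt \right| + \left| \int_{x_{i-1}}^{x_i} [\gamma'(x_i) - \gamma'(t)] \, dt \right| + \varepsilon \, \Delta x_i$

    $\le |\gamma(x_i) - \gamma(x_{i-1})| + 2\varepsilon \, \Delta x_i.$

If we add these inequalities, we obtain

    $\int_a^b |\gamma'(t)| \, dt \le \Lambda(P, \gamma) + 2\varepsilon(b - a) \le \Lambda(\gamma) + 2\varepsilon(b - a).$

Since $\varepsilon$ was arbitrary,

    $\int_a^b |\gamma'(t)| \, dt \le \Lambda(\gamma).$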


This completes the proof.


EXERCISES

1. Suppose $\alpha$ increases on $[a, b]$, $a \le x_0 \le b$, $\alpha$ is continuous at $x_0$, $f(x_0) = 1$, and
$f(x) = 0$ if $x \ne x_0$. Prove that $f \in \mathscr{R}(\alpha)$ and that $\int f \, d\alpha = 0$.

2. Suppose $f \ge 0$, $f$ is continuous on $[a, b]$, and $\int_a^b f(x) \, dx = 0$. Prove that $f(x) = 0$
for all $x \in [a, b]$. (Compare this with Exercise 1.)

3. Define three functions $\beta_1, \beta_2, \beta_3$ as follows: $\beta_j(x) = 0$ if $x < 0$, $\beta_j(x) = 1$ if $x > 0$
for $j = 1, 2, 3$; and $\beta_1(0) = 0$, $\beta_2(0) = 1$, $\beta_3(0) = \frac{1}{2}$. Let $f$ be a bounded function on
$[-1, 1]$.

(a) Prove that $f \in \mathscr{R}(\beta_1)$ if and only if $f(0+) = f(0)$ and that then

    $\int f \, d\beta_1 = f(0).$

(b) State and prove a similar result for $\beta_2$.

(c) Prove that $f \in \mathscr{R}(\beta_3)$ if and only if $f$ is continuous at 0.

(d) If $f$ is continuous at 0 prove that

    $\int f \, d\beta_1 = \int f \, d\beta_2 = \int f \, d\beta_3 = f(0).$

4. If $f(x) = 0$ for all irrational $x$, $f(x) = 1$ for all rational $x$, prove that $f \notin \mathscr{R}$ on $[a, b]$
for any $a < b$.

5. Suppose $f$ is a bounded real function on $[a, b]$, and $f^2 \in \mathscr{R}$ on $[a, b]$. Does it
follow that $f \in \mathscr{R}$? Does the answer change if we assume that $f^3 \in \mathscr{R}$?

6. Let $P$ be the Cantor set constructed in Sec. 2.44. Let $f$ be a bounded real function
on $[0, 1]$ which is continuous at every point outside $P$. Prove that $f \in \mathscr{R}$ on $[0, 1]$.
Hint: $P$ can be covered by finitely many segments whose total length can be made
as small as desired. Proceed as in Theorem 6.10.

7. Suppose $f$ is a real function on $(0, 1]$ and $f \in \mathscr{R}$ on $[c, 1]$ for every $c > 0$. Define

    $\int_0^1 f(x) \, dx = \lim_{c \to 0} \int_c^1 f(x) \, dx$

if this limit exists (and is finite).

(a) If $f \in \mathscr{R}$ on $[0, 1]$, show that this definition of the integral agrees with the old
one.

(b) Construct a function $f$ such that the above limit exists, although it fails to exist
with $|f|$ in place of $f$.

8. Suppose $f \in \mathscr{R}$ on $[a, b]$ for every $b > a$ where $a$ is fixed. Define

    $\int_a^\infty f(x) \, dx = \lim_{b \to \infty} \int_a^b f(x) \, dx$

if this limit exists (and is finite). In that case, we say that the integral on the left
converges. If it also converges after $f$ has been replaced by $|f|$, it is said to converge
absolutely.

Assume that $f(x) \ge 0$ and that $f$ decreases monotonically on $[1, \infty)$. Prove
that

    $\int_1^\infty f(x) \, dx$

converges if and only if

    $\sum_{n=1}^{\infty} f(n)$

converges. (This is the so-called "integral test" for convergence of series.)

9. Show that integration by parts can sometimes be applied to the "improper"
integrals defined in Exercises 7 and 8. (State appropriate hypotheses, formulate a
theorem, and prove it.) For instance show that

    $\int_0^\infty \frac{\cos x}{1 + x} \, dx = \int_0^\infty \frac{\sin x}{(1 + x)^2} \, dx.$

Show that one of these integrals converges absolutely, but that the other does not.
10. Let $p$ and $q$ be positive real numbers such that

    $\frac{1}{p} + \frac{1}{q} = 1.$

Prove the following statements.

(a) If $u \ge 0$ and $v \ge 0$, then

    $uv \le \frac{u^p}{p} + \frac{v^q}{q}.$

Equality holds if and only if $u^p = v^q$.

(b) If $f \in \mathscr{R}(\alpha)$, $g \in \mathscr{R}(\alpha)$, $f \ge 0$, $g \ge 0$, and

    $\int_a^b f^p \, d\alpha = 1 = \int_a^b g^q \, d\alpha,$

then

    $\int_a^b fg \, d\alpha \le 1.$

(c) If $f$ and $g$ are complex functions in $\mathscr{R}(\alpha)$, then

    $\left| \int_a^b fg \, d\alpha \right| \le \left\{ \int_a^b |f|^p \, d\alpha \right\}^{1/p} \left\{ \int_a^b |g|^q \, d\alpha \right\}^{1/q}.$

This is Hölder's inequality. When $p = q = 2$ it is usually called the Schwarz
inequality. (Note that Theorem 1.35 is a very special case of this.)

(d) Show that Hölder's inequality is also true for the "improper" integrals described
in Exercises 7 and 8.

11. Let $\alpha$ be a fixed increasing function on $[a, b]$. For $u \in \mathscr{R}(\alpha)$, define

    $\|u\|_2 = \left\{ \int_a^b |u|^2 \, d\alpha \right\}^{1/2}.$

Suppose $f, g, h \in \mathscr{R}(\alpha)$, and prove the triangle inequality

    $\|f - h\|_2 \le \|f - g\|_2 + \|g - h\|_2$

as a consequence of the Schwarz inequality, as in the proof of Theorem 1.37.

12. With the notations of Exercise 11, suppose $f \in \mathscr{R}(\alpha)$ and $\varepsilon > 0$. Prove that
there exists a continuous function $g$ on $[a, b]$ such that $\|f - g\|_2 < \varepsilon$.

Hint: Let $P = \{x_0, \dots, x_n\}$ be a suitable partition of $[a, b]$, define

    $g(t) = \frac{x_i - t}{\Delta x_i} f(x_{i-1}) + \frac{t - x_{i-1}}{\Delta x_i} f(x_i)$

if $x_{i-1} \le t \le x_i$.

13. Define

    $f(x) = \int_x^{x+1} \sin(t^2) \, dt.$

(a) Prove that $|f(x)| < 1/x$ if $x > 0$.

Hint: Put $t^2 = u$ and integrate by parts, to show that $f(x)$ is equal to

    $\frac{\cos(x^2)}{2x} - \frac{\cos[(x+1)^2]}{2(x+1)} - \int_{x^2}^{(x+1)^2} \frac{\cos u}{4u^{3/2}} \, du.$

Replace $\cos u$ by $-1$.

(b) Prove that

    $2x f(x) = \cos(x^2) - \cos[(x+1)^2] + r(x)$

where $|r(x)| < c/x$ and $c$ is a constant.

(c) Find the upper and lower limits of $x f(x)$, as $x \to \infty$.

(d) Does $\int_0^\infty \sin(t^2) \, dt$ converge?

14. Deal similarly with

    $f(x) = \int_x^{x+1} \sin(e^t) \, dt.$

Show that

    $e^x |f(x)| < 2$

and that

    $e^x f(x) = \cos(e^x) - e^{-1} \cos(e^{x+1}) + r(x),$

where $|r(x)| < Ce^{-x}$, for some constant $C$.





15. Suppose $f$ is a real, continuously differentiable function on $[a, b]$, $f(a) = f(b) = 0$,
and

    $\int_a^b f^2(x) \, dx = 1.$

Prove that

    $\int_a^b x f(x) f'(x) \, dx = -\frac{1}{2}$

and that

    $\int_a^b [f'(x)]^2 \, dx \cdot \int_a^b x^2 f^2(x) \, dx > \frac{1}{4}.$

16. For $1 < s < \infty$, define

    $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$

(This is Riemann's zeta function, of great importance in the study of the distribution
of prime numbers.) Prove that

(a)    $\zeta(s) = s \int_1^\infty \frac{[x]}{x^{s+1}} \, dx$

and that

(b)    $\zeta(s) = \frac{s}{s - 1} - s \int_1^\infty \frac{x - [x]}{x^{s+1}} \, dx,$

where $[x]$ denotes the greatest integer $\le x$.

Prove that the integral in (b) converges for all $s > 0$.

Hint: To prove (a), compute the difference between the integral over $[1, N]$
and the $N$th partial sum of the series that defines $\zeta(s)$.

17. Suppose $\alpha$ increases monotonically on $[a, b]$, $g$ is continuous, and $g(x) = G'(x)$
for $a \le x \le b$. Prove that

    $\int_a^b \alpha(x)g(x) \, dx = G(b)\alpha(b) - G(a)\alpha(a) - \int_a^b G \, d\alpha.$

Hint: Take $g$ real, without loss of generality. Given $P = \{x_0, x_1, \dots, x_n\}$,
choose $t_i \in (x_{i-1}, x_i)$ so that $g(t_i) \, \Delta x_i = G(x_i) - G(x_{i-1})$. Show that

    $\sum_{i=1}^{n} \alpha(x_i)g(t_i) \, \Delta x_i = G(b)\alpha(b) - G(a)\alpha(a) - \sum_{i=1}^{n} G(x_{i-1}) \, \Delta\alpha_i.$

18. Let $\gamma_1, \gamma_2, \gamma_3$ be curves in the complex plane, defined on $[0, 2\pi]$ by

    $\gamma_1(t) = e^{it}, \qquad \gamma_2(t) = e^{2it}, \qquad \gamma_3(t) = e^{2\pi i t \sin(1/t)}.$

Show that these three curves have the same range, that $\gamma_1$ and $\gamma_2$ are rectifiable,
that the length of $\gamma_1$ is $2\pi$, that the length of $\gamma_2$ is $4\pi$, and that $\gamma_3$ is not rectifiable.

19. Let $\gamma_1$ be a curve in $R^k$, defined on $[a, b]$; let $\varphi$ be a continuous 1-1 mapping of
$[c, d]$ onto $[a, b]$, such that $\varphi(c) = a$; and define $\gamma_2(s) = \gamma_1(\varphi(s))$. Prove that $\gamma_2$ is
an arc, a closed curve, or a rectifiable curve if and only if the same is true of $\gamma_1$.
Prove that $\gamma_2$ and $\gamma_1$ have the same length.



SEQUENCES AND SERIES OF FUNCTIONS 


In the present chapter we confine our attention to complex-valued functions 
(including the real-valued ones, of course), although many of the theorems and 
proofs which follow extend without difficulty to vector-valued functions, and 
even to mappings into general metric spaces. We choose to stay within this 
simple framework in order to focus attention on the most important aspects of 
the problems that arise when limit processes are interchanged. 


DISCUSSION OF MAIN PROBLEM 


7.1 Definition Suppose $\{f_n\}$, $n = 1, 2, 3, \dots$, is a sequence of functions
defined on a set $E$, and suppose that the sequence of numbers $\{f_n(x)\}$ converges
for every $x \in E$. We can then define a function $f$ by

(1)    $f(x) = \lim_{n\to\infty} f_n(x) \qquad (x \in E).$

Under these circumstances we say that $\{f_n\}$ converges on $E$ and that $f$ is
the limit, or the limit function, of $\{f_n\}$. Sometimes we shall use a more descriptive
terminology and shall say that "$\{f_n\}$ converges to $f$ pointwise on $E$" if (1) holds.
Similarly, if $\sum f_n(x)$ converges for every $x \in E$, and if we define

(2)    $f(x) = \sum_{n=1}^{\infty} f_n(x) \qquad (x \in E),$

the function $f$ is called the sum of the series $\sum f_n$.

The main problem which arises is to determine whether important
properties of functions are preserved under the limit operations (1) and (2).
For instance, if the functions $f_n$ are continuous, or differentiable, or integrable,
is the same true of the limit function? What are the relations between $f_n'$ and $f'$,
say, or between the integrals of $f_n$ and that of $f$?

To say that $f$ is continuous at $x$ means

    $\lim_{t\to x} f(t) = f(x).$

Hence, to ask whether the limit of a sequence of continuous functions is continuous
is the same as to ask whether

(3)    $\lim_{t\to x} \lim_{n\to\infty} f_n(t) = \lim_{n\to\infty} \lim_{t\to x} f_n(t),$

i.e., whether the order in which limit processes are carried out is immaterial.
On the left side of (3), we first let $n \to \infty$, then $t \to x$; on the right side, $t \to x$
first, then $n \to \infty$.

We shall now show by means of several examples that limit processes 
cannot in general be interchanged without affecting the result. Afterward, we 
shall prove that under certain conditions the order in which limit operations 
are carried out is immaterial. 

Our first example, and the simplest one, concerns a “double sequence.” 

7.2 Example For $m = 1, 2, 3, \dots$, $n = 1, 2, 3, \dots$, let

    $s_{m,n} = \frac{m}{m + n}.$

Then, for every fixed $n$,

    $\lim_{m\to\infty} s_{m,n} = 1,$

so that

(4)    $\lim_{n\to\infty} \lim_{m\to\infty} s_{m,n} = 1.$

On the other hand, for every fixed $m$,

    $\lim_{n\to\infty} s_{m,n} = 0,$

so that

(5)    $\lim_{m\to\infty} \lim_{n\to\infty} s_{m,n} = 0.$
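
A quick numerical sketch of this example: holding one index fixed and making the other very large gives values near 1 or near 0, depending on which index is large. The function name `s` and the stand-in value `10**9` for "large" are assumptions made for the illustration.

```python
def s(m, n):
    """The double sequence s_{m,n} = m / (m + n) of Example 7.2."""
    return m / (m + n)


large = 10**9  # hypothetical stand-in for letting an index tend to infinity

m_first = s(large, 5)  # n fixed at 5, m large: close to 1
n_first = s(5, large)  # m fixed at 5, n large: close to 0
print(m_first, n_first)
```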


7.3 Example Let

    $f_n(x) = \frac{x^2}{(1 + x^2)^n} \qquad (x \text{ real}; \; n = 0, 1, 2, \dots),$

and consider

(6)    $f(x) = \sum_{n=0}^{\infty} f_n(x) = \sum_{n=0}^{\infty} \frac{x^2}{(1 + x^2)^n}.$

Since $f_n(0) = 0$, we have $f(0) = 0$. For $x \ne 0$, the last series in (6) is a convergent
geometric series with sum $1 + x^2$ (Theorem 3.26). Hence

(7)    $f(x) = 0$ if $x = 0$, and $f(x) = 1 + x^2$ if $x \ne 0$,

so that a convergent series of continuous functions may have a discontinuous
sum.
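
The jump at the origin can be seen numerically. This is a sketch only: the helper `partial_sum`, the sample point $x = 0.1$, and the cutoff of 5000 terms are illustrative assumptions. At $x = 0.1$ the partial sum is close to $1 + x^2 = 1.01$, while at $x = 0$ every term vanishes.

```python
def partial_sum(x, terms):
    """Sum of x^2 / (1 + x^2)^n for n = 0, ..., terms - 1, as in (6)."""
    return sum(x * x / (1 + x * x) ** n for n in range(terms))


near_zero = partial_sum(0.1, 5000)  # approximately 1 + 0.1^2
at_zero = partial_sum(0.0, 5000)    # every term is 0
print(near_zero, at_zero)
```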


7.4 Example For $m = 1, 2, 3, \dots$, put

    $f_m(x) = \lim_{n\to\infty} (\cos m! \pi x)^{2n}.$

When $m!x$ is an integer, $f_m(x) = 1$. For all other values of $x$, $f_m(x) = 0$. Now let

    $f(x) = \lim_{m\to\infty} f_m(x).$

For irrational $x$, $f_m(x) = 0$ for every $m$; hence $f(x) = 0$. For rational $x$, say
$x = p/q$, where $p$ and $q$ are integers, we see that $m!x$ is an integer if $m \ge q$, so
that $f(x) = 1$. Hence

(8)    $\lim_{m\to\infty} \lim_{n\to\infty} (\cos m! \pi x)^{2n} = 0$ if $x$ is irrational, and $= 1$ if $x$ is rational.

We have thus obtained an everywhere discontinuous limit function, which
is not Riemann-integrable (Exercise 4, Chap. 6).

7.5 Example Let

(9)    $f_n(x) = \frac{\sin nx}{\sqrt{n}} \qquad (x \text{ real}, \; n = 1, 2, 3, \dots),$

and

    $f(x) = \lim_{n\to\infty} f_n(x) = 0.$

Then $f'(x) = 0$, and

    $f_n'(x) = \sqrt{n} \cos nx,$

so that $\{f_n'\}$ does not converge to $f'$. For instance,

    $f_n'(0) = \sqrt{n} \to +\infty$

as $n \to \infty$, whereas $f'(0) = 0$.

7.6 Example Let

(10)    $f_n(x) = n^2 x (1 - x^2)^n \qquad (0 \le x \le 1, \; n = 1, 2, 3, \dots).$

For $0 < x \le 1$, we have

    $\lim_{n\to\infty} f_n(x) = 0,$

by Theorem 3.20(d). Since $f_n(0) = 0$, we see that

(11)    $\lim_{n\to\infty} f_n(x) = 0 \qquad (0 \le x \le 1).$

A simple calculation shows that

    $\int_0^1 x(1 - x^2)^n \, dx = \frac{1}{2n + 2}.$

Thus, in spite of (11),

    $\int_0^1 f_n(x) \, dx = \frac{n^2}{2n + 2} \to +\infty$

as $n \to \infty$.

If, in (10), we replace $n^2$ by $n$, (11) still holds, but we now have

    $\lim_{n\to\infty} \int_0^1 f_n(x) \, dx = \lim_{n\to\infty} \frac{n}{2n + 2} = \frac{1}{2},$

whereas

    $\int_0^1 \left[ \lim_{n\to\infty} f_n(x) \right] dx = 0.$

Thus the limit of the integral need not be equal to the integral of the limit,
even if both are finite.
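
The two integrals in this example can be approximated with a simple midpoint rule. The sketch below is an illustration under assumptions: the helper names, the step count, and the sample values $n = 10, 100$ are not from the text. With the factor $n^2$ the integrals follow $n^2/(2n + 2)$ and grow; with the factor $n$ they follow $n/(2n + 2)$ and approach $1/2$; at a fixed point such as $x = 0.5$ the values are already tiny.

```python
def f(n, x, power):
    """n**power * x * (1 - x^2)^n; power == 2 gives (10), power == 1 the variant."""
    return n**power * x * (1.0 - x * x) ** n


def midpoint_integral(g, a, b, steps=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


integrals_n2 = [midpoint_integral(lambda x, n=n: f(n, x, 2), 0.0, 1.0)
                for n in (10, 100)]
integrals_n1 = [midpoint_integral(lambda x, n=n: f(n, x, 1), 0.0, 1.0)
                for n in (10, 100)]
pointwise_value = f(100, 0.5, 2)
print(integrals_n2, integrals_n1, pointwise_value)
```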

After these examples, which show what can go wrong if limit processes 
are interchanged carelessly, we now define a new mode of convergence, stronger 
than pointwise convergence as defined in Definition 7.1, which will enable us to 
arrive at positive results. 


UNIFORM CONVERGENCE 

7.7 Definition We say that a sequence of functions $\{f_n\}$, $n = 1, 2, 3, \dots$,
converges uniformly on $E$ to a function $f$ if for every $\varepsilon > 0$ there is an integer $N$
such that $n \ge N$ implies

(12)    $|f_n(x) - f(x)| \le \varepsilon$

for all $x \in E$.

It is clear that every uniformly convergent sequence is pointwise convergent.
Quite explicitly, the difference between the two concepts is this: If $\{f_n\}$
converges pointwise on $E$, then there exists a function $f$ such that, for every
$\varepsilon > 0$, and for every $x \in E$, there is an integer $N$, depending on $\varepsilon$ and on $x$, such
that (12) holds if $n \ge N$; if $\{f_n\}$ converges uniformly on $E$, it is possible, for each
$\varepsilon > 0$, to find one integer $N$ which will do for all $x \in E$.

We say that the series $\sum f_n(x)$ converges uniformly on $E$ if the sequence
$\{s_n\}$ of partial sums defined by

    $\sum_{i=1}^{n} f_i(x) = s_n(x)$

converges uniformly on $E$.

The Cauchy criterion for uniform convergence is as follows. 

7.8 Theorem The sequence of functions $\{f_n\}$, defined on $E$, converges uniformly
on $E$ if and only if for every $\varepsilon > 0$ there exists an integer $N$ such that $m \ge N$,
$n \ge N$, $x \in E$ implies

(13)    $|f_n(x) - f_m(x)| \le \varepsilon.$

Proof Suppose $\{f_n\}$ converges uniformly on $E$, and let $f$ be the limit
function. Then there is an integer $N$ such that $n \ge N$, $x \in E$ implies

    $|f_n(x) - f(x)| \le \frac{\varepsilon}{2},$

so that

    $|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| \le \varepsilon$

if $n \ge N$, $m \ge N$, $x \in E$.

Conversely, suppose the Cauchy condition holds. By Theorem 3.11,
the sequence $\{f_n(x)\}$ converges, for every $x$, to a limit which we may call
$f(x)$. Thus the sequence $\{f_n\}$ converges on $E$, to $f$. We have to prove that
the convergence is uniform.

Let $\varepsilon > 0$ be given, and choose $N$ such that (13) holds. Fix $n$, and
let $m \to \infty$ in (13). Since $f_m(x) \to f(x)$ as $m \to \infty$, this gives

(14)    $|f_n(x) - f(x)| \le \varepsilon$

for every $n \ge N$ and every $x \in E$, which completes the proof.

The following criterion is sometimes useful. 

7.9 Theorem Suppose

    $\lim_{n\to\infty} f_n(x) = f(x) \qquad (x \in E).$

Put

    $M_n = \sup_{x \in E} |f_n(x) - f(x)|.$

Then $f_n \to f$ uniformly on $E$ if and only if $M_n \to 0$ as $n \to \infty$.

Since this is an immediate consequence of Definition 7.7, we omit the 
details of the proof. 
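
The criterion of Theorem 7.9 lends itself to a numerical sketch. The code below estimates $M_n$ by taking a maximum over a finite grid, which only gives a lower bound for the true supremum; the grid size, the helper name `sup_on_grid`, and the sequence $x/n$ are assumptions made for the illustration. For $f_n(x) = x/n$ on $[0, 1]$ the estimates are $1/n$ and tend to 0. For the sequence of Example 7.6, whose pointwise limit is also 0, the estimates grow instead.

```python
def sup_on_grid(g, a, b, points=10001):
    """Maximum of |g| over a uniform grid on [a, b]: a lower bound for sup |g|."""
    return max(abs(g(a + (b - a) * k / (points - 1))) for k in range(points))


# f_n(x) = x / n converges to 0 uniformly on [0, 1]; here M_n = 1 / n.
M_uniform = [sup_on_grid(lambda x, n=n: x / n, 0.0, 1.0)
             for n in (1, 10, 100)]

# Example 7.6: f_n(x) = n^2 x (1 - x^2)^n tends to 0 pointwise, yet M_n grows.
M_example_7_6 = [sup_on_grid(lambda x, n=n: n * n * x * (1 - x * x) ** n, 0.0, 1.0)
                 for n in (5, 50)]
print(M_uniform, M_example_7_6)
```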

For series, there is a very convenient test for uniform convergence, due to 
Weierstrass. 


7.10 Theorem Suppose $\{f_n\}$ is a sequence of functions defined on $E$, and suppose

    $|f_n(x)| \le M_n \qquad (x \in E, \; n = 1, 2, 3, \dots).$

Then $\sum f_n$ converges uniformly on $E$ if $\sum M_n$ converges.

Note that the converse is not asserted (and is, in fact, not true). 
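
To make the statement concrete, here is a sketch with the series $\sum \sin(nx)/n^2$, for which $M_n = 1/n^2$ is an admissible bound. The series, the grid on $[0, 2\pi]$, and the indices 20 and 200 are illustrative assumptions. The gap between two partial sums, maximized over the grid, never exceeds the corresponding tail of $\sum M_n$, whatever the point $x$.

```python
import math


def partial_sum(x, n):
    """s_n(x) for the hypothetical series sum of sin(i x) / i^2."""
    return sum(math.sin(i * x) / i**2 for i in range(1, n + 1))


def tail_bound(n, m):
    """Sum of M_i = 1 / i^2 for i = n + 1, ..., m."""
    return sum(1.0 / i**2 for i in range(n + 1, m + 1))


grid = [2 * math.pi * k / 200 for k in range(201)]
n, m = 20, 200
worst_gap = max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in grid)
bound = tail_bound(n, m)
print(worst_gap, bound)
```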

Proof If $\sum M_n$ converges, then, for arbitrary $\varepsilon > 0$,

    $\left| \sum_{i=n}^{m} f_i(x) \right| \le \sum_{i=n}^{m} M_i \le \varepsilon \qquad (x \in E),$

provided $m$ and $n$ are large enough. Uniform convergence now follows
from Theorem 7.8.


UNIFORM CONVERGENCE AND CONTINUITY

7.11 Theorem Suppose $f_n \to f$ uniformly on a set $E$ in a metric space. Let $x$ be
a limit point of $E$, and suppose that

(15)    $\lim_{t\to x} f_n(t) = A_n \qquad (n = 1, 2, 3, \dots).$

Then $\{A_n\}$ converges, and

(16)    $\lim_{t\to x} f(t) = \lim_{n\to\infty} A_n.$

In other words, the conclusion is that

(17)    $\lim_{t\to x} \lim_{n\to\infty} f_n(t) = \lim_{n\to\infty} \lim_{t\to x} f_n(t).$

Proof Let $\varepsilon > 0$ be given. By the uniform convergence of $\{f_n\}$, there
exists $N$ such that $n \ge N$, $m \ge N$, $t \in E$ imply

(18)    $|f_n(t) - f_m(t)| \le \varepsilon.$

Letting $t \to x$ in (18), we obtain

    $|A_n - A_m| \le \varepsilon$

for $n \ge N$, $m \ge N$, so that $\{A_n\}$ is a Cauchy sequence and therefore
converges, say to $A$.

Next,

(19)    $|f(t) - A| \le |f(t) - f_n(t)| + |f_n(t) - A_n| + |A_n - A|.$

We first choose $n$ such that

(20)    $|f(t) - f_n(t)| \le \frac{\varepsilon}{3}$

for all $t \in E$ (this is possible by the uniform convergence), and such that

(21)    $|A_n - A| \le \frac{\varepsilon}{3}.$

Then, for this $n$, we choose a neighborhood $V$ of $x$ such that

(22)    $|f_n(t) - A_n| \le \frac{\varepsilon}{3}$

if $t \in V \cap E$, $t \ne x$.

Substituting the inequalities (20) to (22) into (19), we see that

    $|f(t) - A| \le \varepsilon,$

provided $t \in V \cap E$, $t \ne x$. This is equivalent to (16).

7.12 Theorem If $\{f_n\}$ is a sequence of continuous functions on $E$, and if $f_n \to f$
uniformly on $E$, then $f$ is continuous on $E$.

This very important result is an immediate corollary of Theorem 7.11.

The converse is not true; that is, a sequence of continuous functions may 
converge to a continuous function, although the convergence is not uniform. 
Example 7.6 is of this kind (to see this, apply Theorem 7.9). But there is a case 
in which we can assert the converse. 

7.13 Theorem Suppose $K$ is compact, and

(a) $\{f_n\}$ is a sequence of continuous functions on $K$,

(b) $\{f_n\}$ converges pointwise to a continuous function $f$ on $K$,

(c) $f_n(x) \ge f_{n+1}(x)$ for all $x \in K$, $n = 1, 2, 3, \dots$.

Then $f_n \to f$ uniformly on $K$.

Proof Put $g_n = f_n - f$. Then $g_n$ is continuous, $g_n \to 0$ pointwise, and
$g_n \ge g_{n+1}$. We have to prove that $g_n \to 0$ uniformly on $K$.

Let $\varepsilon > 0$ be given. Let $K_n$ be the set of all $x \in K$ with $g_n(x) \ge \varepsilon$.
Since $g_n$ is continuous, $K_n$ is closed (Theorem 4.8), hence compact (Theorem
2.35). Since $g_n \ge g_{n+1}$, we have $K_n \supset K_{n+1}$. Fix $x \in K$. Since $g_n(x) \to 0$,
we see that $x \notin K_n$ if $n$ is sufficiently large. Thus $x \notin \bigcap K_n$. In other words,
$\bigcap K_n$ is empty. Hence $K_N$ is empty for some $N$ (Theorem 2.36). It follows
that $0 \le g_n(x) < \varepsilon$ for all $x \in K$ and for all $n \ge N$. This proves the theorem.

Let us note that compactness is really needed here. For instance, if

    $f_n(x) = \frac{1}{nx + 1} \qquad (0 < x < 1; \; n = 1, 2, 3, \dots)$

then $f_n(x) \to 0$ monotonically in $(0, 1)$, but the convergence is not uniform.
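
A short numerical sketch of why uniformity fails here: at the point $x = 1/n$, which lies in $(0, 1)$ for $n \ge 2$, the value of $f_n$ is $1/2$ no matter how large $n$ is, while at any fixed point the values do tend to 0. The sample indices and the fixed point $x = 0.5$ are assumptions made for the illustration.

```python
def f(n, x):
    """f_n(x) = 1 / (n x + 1) on the open interval (0, 1)."""
    return 1.0 / (n * x + 1.0)


indices = (2, 10, 1000, 10**6)
values_at_1_over_n = [f(n, 1.0 / n) for n in indices]  # stays near 1/2
values_at_fixed_x = [f(n, 0.5) for n in indices]       # decreases toward 0
print(values_at_1_over_n, values_at_fixed_x)
```

So the supremum of $f_n$ over $(0, 1)$ is at least $1/2$ for every $n$, and by Theorem 7.9 the convergence cannot be uniform.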

7.14 Definition If $X$ is a metric space, $\mathscr{C}(X)$ will denote the set of all complex-valued,
continuous, bounded functions with domain $X$.

[Note that boundedness is redundant if $X$ is compact (Theorem 4.15).
Thus $\mathscr{C}(X)$ consists of all complex continuous functions on $X$ if $X$ is compact.]
We associate with each $f \in \mathscr{C}(X)$ its supremum norm

    $\|f\| = \sup_{x \in X} |f(x)|.$

Since $f$ is assumed to be bounded, $\|f\| < \infty$. It is obvious that $\|f\| = 0$ only if
$f(x) = 0$ for every $x \in X$, that is, only if $f = 0$. If $h = f + g$, then

    $|h(x)| \le |f(x)| + |g(x)| \le \|f\| + \|g\|$

for all $x \in X$; hence

    $\|f + g\| \le \|f\| + \|g\|.$

If we define the distance between $f \in \mathscr{C}(X)$ and $g \in \mathscr{C}(X)$ to be $\|f - g\|$,
it follows that Axioms 2.15 for a metric are satisfied.

We have thus made $\mathscr{C}(X)$ into a metric space.

Theorem 7.9 can be rephrased as follows:

A sequence $\{f_n\}$ converges to $f$ with respect to the metric of $\mathscr{C}(X)$ if and
only if $f_n \to f$ uniformly on $X$.

Accordingly, closed subsets of $\mathscr{C}(X)$ are sometimes called uniformly
closed, the closure of a set $\mathscr{A} \subset \mathscr{C}(X)$ is called its uniform closure, and so on.

7.15 Theorem The above metric makes $\mathscr{C}(X)$ into a complete metric space.

Proof Let $\{f_n\}$ be a Cauchy sequence in $\mathscr{C}(X)$. This means that to each
$\varepsilon > 0$ corresponds an $N$ such that $\|f_n - f_m\| < \varepsilon$ if $n \ge N$ and $m \ge N$.
It follows (by Theorem 7.8) that there is a function $f$ with domain $X$ to
which $\{f_n\}$ converges uniformly. By Theorem 7.12, $f$ is continuous.
Moreover, $f$ is bounded, since there is an $n$ such that $|f(x) - f_n(x)| < 1$
for all $x \in X$, and $f_n$ is bounded.

Thus $f \in \mathscr{C}(X)$, and since $f_n \to f$ uniformly on $X$, we have
$\|f - f_n\| \to 0$ as $n \to \infty$.

7.16 Theorem Let $\alpha$ be monotonically increasing on $[a, b]$. Suppose $f_n \in \mathscr{R}(\alpha)$
on $[a, b]$, for $n = 1, 2, 3, \dots$, and suppose $f_n \to f$ uniformly on $[a, b]$. Then $f \in \mathscr{R}(\alpha)$
on $[a, b]$, and

(23)    $\int_a^b f \, d\alpha = \lim_{n\to\infty} \int_a^b f_n \, d\alpha.$

(The existence of the limit is part of the conclusion.)


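Proof It suffices to prove this for real $f_n$. Put

(24)    $\varepsilon_n = \sup |f_n(x) - f(x)|,$

the supremum being taken over $a \le x \le b$. Then

    $f_n - \varepsilon_n \le f \le f_n + \varepsilon_n,$

so that the upper and lower integrals of $f$ (see Definition 6.2) satisfy

(25)    $\int_a^b (f_n - \varepsilon_n) \, d\alpha \le \underline{\int} f \, d\alpha \le \overline{\int} f \, d\alpha \le \int_a^b (f_n + \varepsilon_n) \, d\alpha.$

Hence

    $0 \le \overline{\int} f \, d\alpha - \underline{\int} f \, d\alpha \le 2\varepsilon_n [\alpha(b) - \alpha(a)].$

Since $\varepsilon_n \to 0$ as $n \to \infty$ (Theorem 7.9), the upper and lower integrals of $f$
are equal.

Thus $f \in \mathscr{R}(\alpha)$. Another application of (25) now yields

(26)    $\left| \int_a^b f \, d\alpha - \int_a^b f_n \, d\alpha \right| \le \varepsilon_n [\alpha(b) - \alpha(a)].$

This implies (23).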

Corollary If $f_n \in \mathscr{R}(\alpha)$ on $[a, b]$ and if

    $f(x) = \sum_{n=1}^{\infty} f_n(x) \qquad (a \le x \le b),$

the series converging uniformly on $[a, b]$, then

    $\int_a^b f \, d\alpha = \sum_{n=1}^{\infty} \int_a^b f_n \, d\alpha.$

In other words, the series may be integrated term by term.


UNIFORM CONVERGENCE AND DIFFERENTIATION 

We have already seen, in Example 7.5, that uniform convergence of $\{f_n\}$ implies
nothing about the sequence $\{f_n'\}$. Thus stronger hypotheses are required for the
assertion that $f_n' \to f'$ if $f_n \to f$.

7.17 Theorem Suppose $\{f_n\}$ is a sequence of functions, differentiable on $[a, b]$
and such that $\{f_n(x_0)\}$ converges for some point $x_0$ on $[a, b]$. If $\{f_n'\}$ converges
uniformly on $[a, b]$, then $\{f_n\}$ converges uniformly on $[a, b]$, to a function $f$, and

(27)    $f'(x) = \lim_{n\to\infty} f_n'(x) \qquad (a \le x \le b).$

Proof Let $\varepsilon > 0$ be given. Choose $N$ such that $n \ge N$, $m \ge N$, implies

(28)    $|f_n(x_0) - f_m(x_0)| < \frac{\varepsilon}{2}$

and

(29)    $|f_n'(t) - f_m'(t)| < \frac{\varepsilon}{2(b - a)} \qquad (a \le t \le b).$

If we apply the mean value theorem 5.19 to the function $f_n - f_m$, (29)
shows that

(30)    $|f_n(x) - f_m(x) - f_n(t) + f_m(t)| \le \frac{|x - t| \, \varepsilon}{2(b - a)} \le \frac{\varepsilon}{2}$

for any $x$ and $t$ on $[a, b]$, if $n \ge N$, $m \ge N$. The inequality

    $|f_n(x) - f_m(x)| \le |f_n(x) - f_m(x) - f_n(x_0) + f_m(x_0)| + |f_n(x_0) - f_m(x_0)|$

implies, by (28) and (30), that

    $|f_n(x) - f_m(x)| < \varepsilon \qquad (a \le x \le b, \; n \ge N, \; m \ge N),$

so that $\{f_n\}$ converges uniformly on $[a, b]$. Let

    $f(x) = \lim_{n\to\infty} f_n(x) \qquad (a \le x \le b).$


Let us now fix a point $x$ on $[a, b]$ and define

(31)    $\varphi_n(t) = \frac{f_n(t) - f_n(x)}{t - x}, \qquad \varphi(t) = \frac{f(t) - f(x)}{t - x}$

for $a \le t \le b$, $t \ne x$. Then

(32)    $\lim_{t\to x} \varphi_n(t) = f_n'(x) \qquad (n = 1, 2, 3, \dots).$

The first inequality in (30) shows that

    $|\varphi_n(t) - \varphi_m(t)| \le \frac{\varepsilon}{2(b - a)} \qquad (n \ge N, \; m \ge N),$

so that $\{\varphi_n\}$ converges uniformly, for $t \ne x$. Since $\{f_n\}$ converges to $f$, we
conclude from (31) that

(33)    $\lim_{n\to\infty} \varphi_n(t) = \varphi(t)$

uniformly for $a \le t \le b$, $t \ne x$.

If we now apply Theorem 7.11 to $\{\varphi_n\}$, (32) and (33) show that

    $\lim_{t\to x} \varphi(t) = \lim_{n\to\infty} f_n'(x);$

and this is (27), by the definition of $\varphi(t)$.

Remark: If the continuity of the functions $f_n'$ is assumed in addition to
the above hypotheses, then a much shorter proof of (27) can be based on
Theorem 7.16 and the fundamental theorem of calculus.

7.18 Theorem There exists a real continuous function on the real line which is
nowhere differentiable.

Proof Define

(34)    $\varphi(x) = |x| \qquad (-1 \le x \le 1)$

and extend the definition of $\varphi(x)$ to all real $x$ by requiring that

(35)    $\varphi(x + 2) = \varphi(x).$

Then, for all $s$ and $t$,

(36)    $|\varphi(s) - \varphi(t)| \le |s - t|.$

In particular, $\varphi$ is continuous on $R^1$. Define

(37)    $f(x) = \sum_{n=0}^{\infty} \left( \tfrac{3}{4} \right)^n \varphi(4^n x).$

Since $0 \le \varphi \le 1$, Theorem 7.10 shows that the series (37) converges
uniformly on $R^1$. By Theorem 7.12, $f$ is continuous on $R^1$.

Now fix a real number $x$ and a positive integer $m$. Put

(38)    $\delta_m = \pm \tfrac{1}{2} \cdot 4^{-m}$

where the sign is so chosen that no integer lies between $4^m x$ and $4^m (x + \delta_m)$.
This can be done, since $4^m |\delta_m| = \tfrac{1}{2}$. Define

(39)    $\gamma_n = \frac{\varphi(4^n (x + \delta_m)) - \varphi(4^n x)}{\delta_m}.$

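When $n > m$, then $4^n \delta_m$ is an even integer, so that $\gamma_n = 0$. When $0 \le n \le m$,
(36) implies that $|\gamma_n| \le 4^n$.

Since $|\gamma_m| = 4^m$, we conclude that

    $\left| \frac{f(x + \delta_m) - f(x)}{\delta_m} \right| = \left| \sum_{n=0}^{m} \left( \tfrac{3}{4} \right)^n \gamma_n \right| \ge 3^m - \sum_{n=0}^{m-1} 3^n = \tfrac{1}{2}(3^m + 1).$

As $m \to \infty$, $\delta_m \to 0$. It follows that $f$ is not differentiable at $x$.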
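
The growth of these difference quotients can be checked in exact rational arithmetic. The following is a sketch under assumptions: the sample point $x = 1/3$, the helper names, and the use of a partial sum with a few terms beyond $n = m$ are choices made for the illustration. Truncating the series is harmless here because the terms with $n > m$ contribute exactly zero to the difference, as noted in the proof.

```python
import math
from fractions import Fraction


def phi(x):
    """|x| on [-1, 1], extended to all reals with period 2, as in (34) and (35)."""
    r = x % 2  # representative in [0, 2)
    return r if r <= 1 else 2 - r


def f_partial(x, terms):
    """Partial sum of the series (37) with the given number of terms."""
    return sum(Fraction(3, 4) ** n * phi(4**n * x) for n in range(terms))


def delta(x, m):
    """delta_m = +-(1/2) 4^(-m), signed so that no integer lies strictly
    between 4^m x and 4^m (x + delta_m)."""
    y = 4**m * x
    d = Fraction(1, 2 * 4**m)
    next_integer = math.floor(y) + 1  # smallest integer strictly above y
    return d if next_integer >= y + Fraction(1, 2) else -d


def difference_quotient(x, m, extra_terms=5):
    d = delta(x, m)
    terms = m + 1 + extra_terms  # terms with n > m cancel exactly
    return (f_partial(x + d, terms) - f_partial(x, terms)) / d


x = Fraction(1, 3)  # hypothetical sample point
quotients = [difference_quotient(x, m) for m in range(1, 9)]
for m, q in zip(range(1, 9), quotients):
    print(m, abs(q), Fraction(3**m + 1, 2))
```

Each printed quotient is at least $\frac{1}{2}(3^m + 1)$ in absolute value, in line with the estimate above.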


EQUICONTINUOUS FAMILIES OF FUNCTIONS 

In Theorem 3.6 we saw that every bounded sequence of complex numbers
contains a convergent subsequence, and the question arises whether something
similar is true for sequences of functions. To make the question more precise,
we shall define two kinds of boundedness.

7.19 Definition Let $\{f_n\}$ be a sequence of functions defined on a set $E$.

We say that $\{f_n\}$ is pointwise bounded on $E$ if the sequence $\{f_n(x)\}$ is bounded
for every $x \in E$, that is, if there exists a finite-valued function $\varphi$ defined on $E$
such that

    $|f_n(x)| < \varphi(x) \qquad (x \in E, \; n = 1, 2, 3, \dots).$

We say that $\{f_n\}$ is uniformly bounded on $E$ if there exists a number $M$
such that

    $|f_n(x)| < M \qquad (x \in E, \; n = 1, 2, 3, \dots).$

Now if $\{f_n\}$ is pointwise bounded on $E$ and $E_1$ is a countable subset of $E$,
it is always possible to find a subsequence $\{f_{n_k}\}$ such that $\{f_{n_k}(x)\}$ converges for
every $x \in E_1$. This can be done by the diagonal process which is used in the
proof of Theorem 7.23.

However, even if $\{f_n\}$ is a uniformly bounded sequence of continuous
functions on a compact set $E$, there need not exist a subsequence which converges
pointwise on $E$. In the following example, this would be quite troublesome
to prove with the equipment which we have at hand so far, but the proof
is quite simple if we appeal to a theorem from Chap. 11.


7.20 Example Let

$f_n(x) = \sin nx$    $(0 \le x \le 2\pi,\ n = 1, 2, 3, \dots)$.

Suppose there exists a sequence $\{n_k\}$ such that $\{\sin n_k x\}$ converges, for every
$x \in [0, 2\pi]$. In that case we must have

$\lim_{k\to\infty} (\sin n_k x - \sin n_{k+1} x) = 0$    $(0 \le x \le 2\pi)$;

hence

(40)    $\lim_{k\to\infty} (\sin n_k x - \sin n_{k+1} x)^2 = 0$    $(0 \le x \le 2\pi)$.

By Lebesgue's theorem concerning integration of boundedly convergent
sequences (Theorem 11.32), (40) implies

(41)    $\lim_{k\to\infty} \int_0^{2\pi} (\sin n_k x - \sin n_{k+1} x)^2\, dx = 0.$

But a simple calculation shows that

$\int_0^{2\pi} (\sin n_k x - \sin n_{k+1} x)^2\, dx = 2\pi,$

which contradicts (41).
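
The "simple calculation" in Example 7.20 can be confirmed numerically. This is a minimal sketch, not part of the text; the function name and the step count are illustrative choices. For distinct positive integers $p$ and $q$ the integrand expands to $\sin^2 px + \sin^2 qx - 2\sin px \sin qx$, whose integral over $[0, 2\pi]$ is $\pi + \pi - 0$.

```python
import math

def integral_of_squared_difference(p, q, steps=20000):
    """Equal-weight sum approximating the integral over [0, 2*pi] of
    (sin(p x) - sin(q x))**2; for a periodic integrand this coincides
    with the trapezoidal rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = i * h
        total += (math.sin(p * x) - math.sin(q * x)) ** 2
    return total * h

for p, q in [(1, 2), (3, 7), (10, 11)]:
    print(p, q, integral_of_squared_difference(p, q), 2 * math.pi)
```

The value does not depend on which distinct frequencies are chosen, so the integrals in (41) cannot tend to 0.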

Another question is whether every convergent sequence contains a
uniformly convergent subsequence. Our next example will show that this
need not be so, even if the sequence is uniformly bounded on a compact set.
(Example 7.6 shows that a sequence of bounded functions may converge
without being uniformly bounded; but it is trivial to see that uniform
convergence of a sequence of bounded functions implies uniform boundedness.)


7.21 Example Let

$f_n(x) = \dfrac{x^2}{x^2 + (1 - nx)^2}$    $(0 \le x \le 1,\ n = 1, 2, 3, \dots)$.

Then $|f_n(x)| \le 1$, so that $\{f_n\}$ is uniformly bounded on $[0, 1]$. Also

$\lim_{n\to\infty} f_n(x) = 0$    $(0 \le x \le 1)$,

but

$f_n\!\left(\tfrac{1}{n}\right) = 1$    $(n = 1, 2, 3, \dots)$,

so that no subsequence can converge uniformly on [0, 1].
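
A short computation makes the behaviour of Example 7.21 visible. This is only an illustrative sketch; the name `f` and the sample points are arbitrary choices. At a fixed point the values tend to 0, while the value at the moving point $1/n$ stays equal to 1.

```python
def f(n, x):
    """f_n(x) = x^2 / (x^2 + (1 - n x)^2) from Example 7.21."""
    return x * x / (x * x + (1 - n * x) ** 2)

for n in (1, 10, 100, 1000):
    # second column: value at the fixed point 0.5; third column: value at x = 1/n
    print(n, f(n, 0.5), f(n, 1 / n))
```

Since the supremum of $|f_n|$ over $[0, 1]$ is 1 for every $n$, the pointwise limit 0 cannot be a uniform limit of any subsequence.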

The concept which is needed in this connection is that of equicontinuity ; 
it is given in the following definition. 


7.22 Definition A family $\mathscr{F}$ of complex functions $f$ defined on a set $E$ in a
metric space $X$ is said to be equicontinuous on $E$ if for every $\varepsilon > 0$ there exists a
$\delta > 0$ such that

$|f(x) - f(y)| < \varepsilon$

whenever $d(x, y) < \delta$, $x \in E$, $y \in E$, and $f \in \mathscr{F}$. Here $d$ denotes the metric of $X$.

It is clear that every member of an equicontinuous family is uniformly 
continuous. 

The sequence of Example 7.21 is not equicontinuous. 

Theorems 7.24 and 7.25 will show that there is a very close relation 
between equicontinuity, on the one hand, and uniform convergence of sequences 
of continuous functions, on the other. But first we describe a selection process 
which has nothing to do with continuity. 

7.23 Theorem If $\{f_n\}$ is a pointwise bounded sequence of complex functions on
a countable set $E$, then $\{f_n\}$ has a subsequence $\{f_{n_k}\}$ such that $\{f_{n_k}(x)\}$ converges for
every $x \in E$.

Proof Let $\{x_i\}$, $i = 1, 2, 3, \dots$, be the points of $E$, arranged in a sequence.
Since $\{f_n(x_1)\}$ is bounded, there exists a subsequence, which we shall
denote by $\{f_{1,k}\}$, such that $\{f_{1,k}(x_1)\}$ converges as $k \to \infty$.

Let us now consider sequences $S_1, S_2, S_3, \dots$, which we represent
by the array

    $S_1$:  $f_{1,1}$  $f_{1,2}$  $f_{1,3}$  $f_{1,4}$  $\cdots$

    $S_2$:  $f_{2,1}$  $f_{2,2}$  $f_{2,3}$  $f_{2,4}$  $\cdots$

    $S_3$:  $f_{3,1}$  $f_{3,2}$  $f_{3,3}$  $f_{3,4}$  $\cdots$

    $\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots$

and which have the following properties:

(a) $S_n$ is a subsequence of $S_{n-1}$, for $n = 2, 3, 4, \dots$.

(b) $\{f_{n,k}(x_n)\}$ converges, as $k \to \infty$ (the boundedness of $\{f_n(x_n)\}$
makes it possible to choose $S_n$ in this way);

(c) The order in which the functions appear is the same in each
sequence; i.e., if one function precedes another in $S_1$, they are in the same
relation in every $S_n$, until one or the other is deleted. Hence, when
going from one row in the above array to the next below, functions
may move to the left but never to the right.

We now go down the diagonal of the array; i.e., we consider the
sequence

    $S$:  $f_{1,1}$  $f_{2,2}$  $f_{3,3}$  $f_{4,4}$  $\cdots$

By (c), the sequence $S$ (except possibly its first $n - 1$ terms) is a
subsequence of $S_n$, for $n = 1, 2, 3, \dots$. Hence (b) implies that $\{f_{n,n}(x_i)\}$
converges, as $n \to \infty$, for every $x_i \in E$.

7.24 Theorem If $K$ is a compact metric space, if $f_n \in \mathscr{C}(K)$ for $n = 1, 2, 3, \dots$,
and if $\{f_n\}$ converges uniformly on $K$, then $\{f_n\}$ is equicontinuous on $K$.

Proof Let $\varepsilon > 0$ be given. Since $\{f_n\}$ converges uniformly, there is an
integer $N$ such that

(42)    $\|f_n - f_N\| < \varepsilon$    $(n > N)$.

(See Definition 7.14.) Since continuous functions are uniformly
continuous on compact sets, there is a $\delta > 0$ such that

(43)    $|f_i(x) - f_i(y)| < \varepsilon$

if $1 \le i \le N$ and $d(x, y) < \delta$.

If $n > N$ and $d(x, y) < \delta$, it follows that

$|f_n(x) - f_n(y)| \le |f_n(x) - f_N(x)| + |f_N(x) - f_N(y)| + |f_N(y) - f_n(y)| < 3\varepsilon.$

In conjunction with (43), this proves the theorem.

7.25 Theorem If $K$ is compact, if $f_n \in \mathscr{C}(K)$ for $n = 1, 2, 3, \dots$, and if $\{f_n\}$ is
pointwise bounded and equicontinuous on $K$, then

(a) $\{f_n\}$ is uniformly bounded on $K$,

(b) $\{f_n\}$ contains a uniformly convergent subsequence.

Proof

(a) Let $\varepsilon > 0$ be given and choose $\delta > 0$, in accordance with Definition
7.22, so that

(44)    $|f_n(x) - f_n(y)| < \varepsilon$

for all $n$, provided that $d(x, y) < \delta$.

Since $K$ is compact, there are finitely many points $p_1, \dots, p_r$ in $K$
such that to every $x \in K$ corresponds at least one $p_i$ with $d(x, p_i) < \delta$.
Since $\{f_n\}$ is pointwise bounded, there exist $M_i < \infty$ such that $|f_n(p_i)| < M_i$
for all $n$. If $M = \max(M_1, \dots, M_r)$, then $|f_n(x)| < M + \varepsilon$ for every
$x \in K$. This proves (a).

(b) Let $E$ be a countable dense subset of $K$. (For the existence of such a
set $E$, see Exercise 25, Chap. 2.) Theorem 7.23 shows that $\{f_n\}$ has a
subsequence $\{f_{n_i}\}$ such that $\{f_{n_i}(x)\}$ converges for every $x \in E$.

Put $f_{n_i} = g_i$, to simplify the notation. We shall prove that $\{g_i\}$
converges uniformly on $K$.

Let $\varepsilon > 0$, and pick $\delta > 0$ as in the beginning of this proof. Let
$V(x, \delta)$ be the set of all $y \in K$ with $d(x, y) < \delta$. Since $E$ is dense in $K$, and
$K$ is compact, there are finitely many points $x_1, \dots, x_m$ in $E$ such that

(45)    $K \subset V(x_1, \delta) \cup \cdots \cup V(x_m, \delta).$

Since $\{g_i(x)\}$ converges for every $x \in E$, there is an integer $N$ such
that

(46)    $|g_i(x_s) - g_j(x_s)| < \varepsilon$

whenever $i \ge N$, $j \ge N$, $1 \le s \le m$.

If $x \in K$, (45) shows that $x \in V(x_s, \delta)$ for some $s$, so that

$|g_i(x) - g_i(x_s)| < \varepsilon$

for every $i$. If $i \ge N$ and $j \ge N$, it follows from (46) that

$|g_i(x) - g_j(x)| \le |g_i(x) - g_i(x_s)| + |g_i(x_s) - g_j(x_s)| + |g_j(x_s) - g_j(x)| < 3\varepsilon.$


This completes the proof.


THE STONE-WEIERSTRASS THEOREM


7.26 Theorem If $f$ is a continuous complex function on $[a, b]$, there exists a
sequence of polynomials $P_n$ such that

$\lim_{n\to\infty} P_n(x) = f(x)$

uniformly on $[a, b]$. If $f$ is real, the $P_n$ may be taken real.

This is the form in which the theorem was originally discovered by
Weierstrass.

Proof We may assume, without loss of generality, that $[a, b] = [0, 1]$.
We may also assume that $f(0) = f(1) = 0$. For if the theorem is proved
for this case, consider

$g(x) = f(x) - f(0) - x[f(1) - f(0)]$    $(0 \le x \le 1)$.

Here $g(0) = g(1) = 0$, and if $g$ can be obtained as the limit of a uniformly
convergent sequence of polynomials, it is clear that the same is true for $f$,
since $f - g$ is a polynomial.

Furthermore, we define $f(x)$ to be zero for $x$ outside $[0, 1]$. Then $f$
is uniformly continuous on the whole line.

We put

(47)    $Q_n(x) = c_n (1 - x^2)^n$    $(n = 1, 2, 3, \dots)$,

where $c_n$ is chosen so that

(48)    $\int_{-1}^{1} Q_n(x)\, dx = 1$    $(n = 1, 2, 3, \dots)$.

We need some information about the order of magnitude of $c_n$. Since

$\int_{-1}^{1} (1 - x^2)^n\, dx = 2\int_0^1 (1 - x^2)^n\, dx \ge 2\int_0^{1/\sqrt{n}} (1 - x^2)^n\, dx$

$\ge 2\int_0^{1/\sqrt{n}} (1 - nx^2)\, dx = \dfrac{4}{3\sqrt{n}} > \dfrac{1}{\sqrt{n}},$

it follows from (48) that

(49)    $c_n < \sqrt{n}.$

The inequality $(1 - x^2)^n \ge 1 - nx^2$ which we used above is easily
shown to be true by considering the function

$(1 - x^2)^n - 1 + nx^2$

which is zero at $x = 0$ and whose derivative is positive in (0, 1).

For any $\delta > 0$, (49) implies

(50)    $Q_n(x) \le \sqrt{n}\,(1 - \delta^2)^n$    $(\delta \le |x| \le 1)$,

so that $Q_n \to 0$ uniformly in $\delta \le |x| \le 1$.

Now set

(51)    $P_n(x) = \int_{-1}^{1} f(x + t) Q_n(t)\, dt$    $(0 \le x \le 1)$.

Our assumptions about $f$ show, by a simple change of variable, that

$P_n(x) = \int_{-x}^{1-x} f(x + t) Q_n(t)\, dt = \int_0^1 f(t) Q_n(t - x)\, dt,$

and the last integral is clearly a polynomial in $x$. Thus $\{P_n\}$ is a sequence
of polynomials, which are real if $f$ is real.

Given $\varepsilon > 0$, we choose $\delta > 0$ such that $|y - x| < \delta$ implies

$|f(y) - f(x)| < \dfrac{\varepsilon}{2}.$

Let $M = \sup |f(x)|$. Using (48), (50), and the fact that $Q_n(x) \ge 0$, we
see that for $0 \le x \le 1$,

$|P_n(x) - f(x)| = \left| \int_{-1}^{1} [f(x + t) - f(x)] Q_n(t)\, dt \right|$

$\le \int_{-1}^{1} |f(x + t) - f(x)|\, Q_n(t)\, dt$

$\le 2M \int_{-1}^{-\delta} Q_n(t)\, dt + \dfrac{\varepsilon}{2} \int_{-\delta}^{\delta} Q_n(t)\, dt + 2M \int_{\delta}^{1} Q_n(t)\, dt$

$\le 4M \sqrt{n}\,(1 - \delta^2)^n + \dfrac{\varepsilon}{2}$

$< \varepsilon$

for all large enough $n$, which proves the theorem.
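
The construction (47) to (51) can be tried out numerically. The sketch below is only an illustration under stated assumptions: the test function $\sin \pi x$ (set to zero outside $[0, 1]$), the midpoint rule, and all function names are choices made here, not part of the proof. The integrals are approximated rather than evaluated as exact polynomials.

```python
import math

def f(x):
    """Test function: continuous, zero at 0 and 1, and zero outside [0, 1]."""
    return math.sin(math.pi * x) if 0.0 <= x <= 1.0 else 0.0

def midpoints(a, b, steps):
    """Midpoints of a uniform partition of [a, b], together with the mesh width."""
    h = (b - a) / steps
    return [a + (i + 0.5) * h for i in range(steps)], h

def kernel_approximation(n, x, steps=2000):
    """Midpoint-rule value of P_n(x), the integral over [-1, 1] of f(x + t) Q_n(t) dt."""
    ts, h = midpoints(-1.0, 1.0, steps)
    kernel = [(1.0 - t * t) ** n for t in ts]
    c_n = 1.0 / (sum(kernel) * h)  # normalisation (48), computed numerically
    return c_n * h * sum(f(x + t) * k for t, k in zip(ts, kernel))

def sup_error(n, grid=20):
    """Largest |P_n(x) - f(x)| over a uniform grid in [0, 1]."""
    return max(abs(kernel_approximation(n, i / grid) - f(i / grid))
               for i in range(grid + 1))

for n in (4, 16, 64):
    print(n, sup_error(n))
```

The error decreases slowly as $n$ grows, which matches the estimate $4M\sqrt{n}(1 - \delta^2)^n + \varepsilon/2$: the first term decays quickly, but $\delta$ must shrink with $\varepsilon$.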

It is instructive to sketch the graphs of $Q_n$ for a few values of $n$; also,
note that we needed uniform continuity of $f$ to deduce uniform convergence
of $\{P_n\}$.

In the proof of Theorem 7.32 we shall not need the full strength of
Theorem 7.26, but only the following special case, which we state as a corollary.

7.27 Corollary For every interval $[-a, a]$ there is a sequence of real
polynomials $P_n$ such that $P_n(0) = 0$ and such that

$\lim_{n\to\infty} P_n(x) = |x|$

uniformly on $[-a, a]$.

Proof By Theorem 7.26, there exists a sequence $\{P_n^*\}$ of real polynomials
which converges to $|x|$ uniformly on $[-a, a]$. In particular, $P_n^*(0) \to 0$
as $n \to \infty$. The polynomials

$P_n(x) = P_n^*(x) - P_n^*(0)$    $(n = 1, 2, 3, \dots)$

have the desired properties.

We shall now isolate those properties of the polynomials which make 
the Weierstrass theorem possible. 

7.28 Definition A family $\mathscr{A}$ of complex functions defined on a set $E$ is said
to be an algebra if (i) $f + g \in \mathscr{A}$, (ii) $fg \in \mathscr{A}$, and (iii) $cf \in \mathscr{A}$ for all $f \in \mathscr{A}$, $g \in \mathscr{A}$
and for all complex constants $c$, that is, if $\mathscr{A}$ is closed under addition,
multiplication, and scalar multiplication. We shall also have to consider algebras of
real functions; in this case, (iii) is of course only required to hold for all real $c$.

If $\mathscr{A}$ has the property that $f \in \mathscr{A}$ whenever $f_n \in \mathscr{A}$ $(n = 1, 2, 3, \dots)$ and
$f_n \to f$ uniformly on $E$, then $\mathscr{A}$ is said to be uniformly closed.

Let $\mathscr{B}$ be the set of all functions which are limits of uniformly convergent
sequences of members of $\mathscr{A}$. Then $\mathscr{B}$ is called the uniform closure of $\mathscr{A}$. (See
Definition 7.14.)

For example, the set of all polynomials is an algebra, and the Weierstrass
theorem may be stated by saying that the set of continuous functions on $[a, b]$
is the uniform closure of the set of polynomials on $[a, b]$.

7.29 Theorem Let $\mathscr{B}$ be the uniform closure of an algebra $\mathscr{A}$ of bounded
functions. Then $\mathscr{B}$ is a uniformly closed algebra.

Proof If $f \in \mathscr{B}$ and $g \in \mathscr{B}$, there exist uniformly convergent sequences
$\{f_n\}$, $\{g_n\}$ such that $f_n \to f$, $g_n \to g$ and $f_n \in \mathscr{A}$, $g_n \in \mathscr{A}$. Since we are dealing
with bounded functions, it is easy to show that

$f_n + g_n \to f + g$,    $f_n g_n \to fg$,    $cf_n \to cf$,

where $c$ is any constant, the convergence being uniform in each case.
Hence $f + g \in \mathscr{B}$, $fg \in \mathscr{B}$, and $cf \in \mathscr{B}$, so that $\mathscr{B}$ is an algebra.

By Theorem 2.27, $\mathscr{B}$ is (uniformly) closed.

7.30 Definition Let $\mathscr{A}$ be a family of functions on a set $E$. Then $\mathscr{A}$ is said
to separate points on $E$ if to every pair of distinct points $x_1, x_2 \in E$ there
corresponds a function $f \in \mathscr{A}$ such that $f(x_1) \ne f(x_2)$.

If to each $x \in E$ there corresponds a function $g \in \mathscr{A}$ such that $g(x) \ne 0$,
we say that $\mathscr{A}$ vanishes at no point of $E$.

The algebra of all polynomials in one variable clearly has these properties
on $R^1$. An example of an algebra which does not separate points is the set of
all even polynomials, say on $[-1, 1]$, since $f(-x) = f(x)$ for every even function $f$.
The following theorem will illustrate these concepts further.

7.31 Theorem Suppose $\mathscr{A}$ is an algebra of functions on a set $E$, $\mathscr{A}$ separates
points on $E$, and $\mathscr{A}$ vanishes at no point of $E$. Suppose $x_1$, $x_2$ are distinct points
of $E$, and $c_1$, $c_2$ are constants (real if $\mathscr{A}$ is a real algebra). Then $\mathscr{A}$ contains a
function $f$ such that

$f(x_1) = c_1$,    $f(x_2) = c_2$.

Proof The assumptions show that $\mathscr{A}$ contains functions $g$, $h$, and $k$
such that

$g(x_1) \ne g(x_2)$,    $h(x_1) \ne 0$,    $k(x_2) \ne 0$.

Put

$u = gk - g(x_1)k$,    $v = gh - g(x_2)h$.

Then $u \in \mathscr{A}$, $v \in \mathscr{A}$, $u(x_1) = v(x_2) = 0$, $u(x_2) \ne 0$, and $v(x_1) \ne 0$. Therefore

$f = \dfrac{c_1 v}{v(x_1)} + \dfrac{c_2 u}{u(x_2)}$

has the desired properties.
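
The two-point interpolation in the proof of Theorem 7.31 is easy to carry out for a concrete algebra. The sketch below is an illustration under assumptions made here: the algebra is taken to be the polynomials without constant term on $[1, 2]$, and $g = h = k$ is the identity function, which separates points and vanishes nowhere on that interval. The helper name `interpolate` is hypothetical.

```python
def interpolate(g, h, k, x1, x2, c1, c2):
    """Build f = c1*v/v(x1) + c2*u/u(x2) as in the proof of Theorem 7.31."""
    def u(x):
        return g(x) * k(x) - g(x1) * k(x)   # u(x1) = 0, u(x2) != 0
    def v(x):
        return g(x) * h(x) - g(x2) * h(x)   # v(x2) = 0, v(x1) != 0
    def f(x):
        return c1 * v(x) / v(x1) + c2 * u(x) / u(x2)
    return f

identity = lambda x: x
f = interpolate(identity, identity, identity, x1=1.0, x2=2.0, c1=5.0, c2=-3.0)
print(f(1.0), f(2.0))
```

Only sums, products, and scalar multiples of $g$, $h$, $k$ are used, so the resulting $f$ stays inside the algebra even though the algebra contains no constants.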

We now have all the material needed for Stone’s generalization of the 
Weierstrass theorem. 

7.32 Theorem Let $\mathscr{A}$ be an algebra of real continuous functions on a compact
set $K$. If $\mathscr{A}$ separates points on $K$ and if $\mathscr{A}$ vanishes at no point of $K$, then the
uniform closure $\mathscr{B}$ of $\mathscr{A}$ consists of all real continuous functions on $K$.

We shall divide the proof into four steps.

step 1 If $f \in \mathscr{B}$, then $|f| \in \mathscr{B}$.

Proof Let


(52)    $a = \sup |f(x)|$    $(x \in K)$

and let $\varepsilon > 0$ be given. By Corollary 7.27 there exist real numbers
$c_1, \dots, c_n$ such that

(53)    $\left| \sum_{i=1}^{n} c_i y^i - |y| \right| < \varepsilon$    $(-a \le y \le a)$.

Since $\mathscr{B}$ is an algebra, the function

$g = \sum_{i=1}^{n} c_i f^i$

is a member of $\mathscr{B}$. By (52) and (53), we have

$|g(x) - |f(x)|| < \varepsilon$    $(x \in K)$.

Since $\mathscr{B}$ is uniformly closed, this shows that $|f| \in \mathscr{B}$.

step 2 If $f \in \mathscr{B}$ and $g \in \mathscr{B}$, then $\max(f, g) \in \mathscr{B}$ and $\min(f, g) \in \mathscr{B}$.

By $\max(f, g)$ we mean the function $h$ defined by

$h(x) = f(x)$ if $f(x) \ge g(x)$,    $h(x) = g(x)$ if $f(x) < g(x)$,

and $\min(f, g)$ is defined likewise.

Proof Step 2 follows from step 1 and the identities

$\max(f, g) = \dfrac{f + g}{2} + \dfrac{|f - g|}{2}$,

$\min(f, g) = \dfrac{f + g}{2} - \dfrac{|f - g|}{2}$.

By iteration, the result can of course be extended to any finite set
of functions: If $f_1, \dots, f_n \in \mathscr{B}$, then $\max(f_1, \dots, f_n) \in \mathscr{B}$, and
$\min(f_1, \dots, f_n) \in \mathscr{B}$.


step 3 Given a real function $f$, continuous on $K$, a point $x \in K$, and $\varepsilon > 0$, there
exists a function $g_x \in \mathscr{B}$ such that $g_x(x) = f(x)$ and

(54)    $g_x(t) > f(t) - \varepsilon$    $(t \in K)$.

Proof Since $\mathscr{A} \subset \mathscr{B}$ and $\mathscr{A}$ satisfies the hypotheses of Theorem 7.31, so
does $\mathscr{B}$. Hence, for every $y \in K$, we can find a function $h_y \in \mathscr{B}$ such that

(55)    $h_y(x) = f(x)$,    $h_y(y) = f(y)$.

By the continuity of $h_y$ there exists an open set $J_y$, containing $y$,
such that

(56)    $h_y(t) > f(t) - \varepsilon$    $(t \in J_y)$.

Since $K$ is compact, there is a finite set of points $y_1, \dots, y_n$ such that

(57)    $K \subset J_{y_1} \cup \cdots \cup J_{y_n}.$

Put

$g_x = \max(h_{y_1}, \dots, h_{y_n}).$

By step 2, $g_x \in \mathscr{B}$, and the relations (55) to (57) show that $g_x$ has the other
required properties.

step 4 Given a real function $f$, continuous on $K$, and $\varepsilon > 0$, there exists a function
$h \in \mathscr{B}$ such that

(58)    $|h(x) - f(x)| < \varepsilon$    $(x \in K)$.

Since $\mathscr{B}$ is uniformly closed, this statement is equivalent to the conclusion
of the theorem.

Proof Let us consider the functions $g_x$, for each $x \in K$, constructed in
step 3. By the continuity of $g_x$, there exist open sets $V_x$ containing $x$,
such that

(59)    $g_x(t) < f(t) + \varepsilon$    $(t \in V_x)$.

Since $K$ is compact, there exists a finite set of points $x_1, \dots, x_m$
such that

(60)    $K \subset V_{x_1} \cup \cdots \cup V_{x_m}.$

Put

$h = \min(g_{x_1}, \dots, g_{x_m}).$

By step 2, $h \in \mathscr{B}$, and (54) implies

(61)    $h(t) > f(t) - \varepsilon$    $(t \in K)$,

whereas (59) and (60) imply

(62)    $h(t) < f(t) + \varepsilon$    $(t \in K)$.

Finally, (58) follows from (61) and (62).

Theorem 7.32 does not hold for complex algebras. A counterexample is
given in Exercise 21. However, the conclusion of the theorem does hold, even
for complex algebras, if an extra condition is imposed on $\mathscr{A}$, namely, that $\mathscr{A}$
be self-adjoint. This means that for every $f \in \mathscr{A}$ its complex conjugate $\bar{f}$ must
also belong to $\mathscr{A}$; $\bar{f}$ is defined by $\bar{f}(x) = \overline{f(x)}$.

7.33 Theorem Suppose $\mathscr{A}$ is a self-adjoint algebra of complex continuous
functions on a compact set $K$, $\mathscr{A}$ separates points on $K$, and $\mathscr{A}$ vanishes at no
point of $K$. Then the uniform closure $\mathscr{B}$ of $\mathscr{A}$ consists of all complex continuous
functions on $K$. In other words, $\mathscr{A}$ is dense in $\mathscr{C}(K)$.

Proof Let $\mathscr{A}_R$ be the set of all real functions on $K$ which belong to $\mathscr{A}$.

If $f \in \mathscr{A}$ and $f = u + iv$, with $u$, $v$ real, then $2u = f + \bar{f}$, and since $\mathscr{A}$
is self-adjoint, we see that $u \in \mathscr{A}_R$. If $x_1 \ne x_2$, there exists $f \in \mathscr{A}$ such
that $f(x_1) = 1$, $f(x_2) = 0$; hence $0 = u(x_2) \ne u(x_1) = 1$, which shows that
$\mathscr{A}_R$ separates points on $K$. If $x \in K$, then $g(x) \ne 0$ for some $g \in \mathscr{A}$, and
there is a complex number $\lambda$ such that $\lambda g(x) > 0$; if $f = \lambda g$, $f = u + iv$, it
follows that $u(x) > 0$; hence $\mathscr{A}_R$ vanishes at no point of $K$.

Thus $\mathscr{A}_R$ satisfies the hypotheses of Theorem 7.32. It follows that
every real continuous function on $K$ lies in the uniform closure of $\mathscr{A}_R$,
hence lies in $\mathscr{B}$. If $f$ is a complex continuous function on $K$, $f = u + iv$,
then $u \in \mathscr{B}$, $v \in \mathscr{B}$, hence $f \in \mathscr{B}$. This completes the proof.


EXERCISES 

1. Prove that every uniformly convergent sequence of bounded functions is
uniformly bounded.

2. If $\{f_n\}$ and $\{g_n\}$ converge uniformly on a set $E$, prove that $\{f_n + g_n\}$ converges
uniformly on $E$. If, in addition, $\{f_n\}$ and $\{g_n\}$ are sequences of bounded functions,
prove that $\{f_n g_n\}$ converges uniformly on $E$.

3. Construct sequences $\{f_n\}$, $\{g_n\}$ which converge uniformly on some set $E$, but such
that $\{f_n g_n\}$ does not converge uniformly on $E$ (of course, $\{f_n g_n\}$ must converge on
$E$).

4. Consider

$f(x) = \sum_{n=1}^{\infty} \dfrac{1}{1 + n^2 x}.$

For what values of $x$ does the series converge absolutely? On what intervals does
it converge uniformly? On what intervals does it fail to converge uniformly? Is $f$
continuous wherever the series converges? Is $f$ bounded?

5. Let

$f_n(x) = 0$ if $x < \dfrac{1}{n+1}$,    $f_n(x) = \sin^2 \dfrac{\pi}{x}$ if $\dfrac{1}{n+1} \le x \le \dfrac{1}{n}$,    $f_n(x) = 0$ if $\dfrac{1}{n} < x$.

Show that $\{f_n\}$ converges to a continuous function, but not uniformly. Use the
series $\sum f_n$ to show that absolute convergence, even for all $x$, does not imply
uniform convergence.

6. Prove that the series

$\sum_{n=1}^{\infty} (-1)^n \dfrac{x^2 + n}{n^2}$

converges uniformly in every bounded interval, but does not converge absolutely
for any value of $x$.

7. For $n = 1, 2, 3, \dots$, $x$ real, put

$f_n(x) = \dfrac{x}{1 + nx^2}.$

Show that $\{f_n\}$ converges uniformly to a function $f$, and that the equation

$f'(x) = \lim_{n\to\infty} f_n'(x)$

is correct if $x \ne 0$, but false if $x = 0$.

8. If

$I(x) = 0$ if $x \le 0$,    $I(x) = 1$ if $x > 0$,

if $\{x_n\}$ is a sequence of distinct points of $(a, b)$, and if $\sum |c_n|$ converges, prove that
the series

$f(x) = \sum_{n=1}^{\infty} c_n I(x - x_n)$    $(a \le x \le b)$

converges uniformly, and that $f$ is continuous for every $x \ne x_n$.

9. Let $\{f_n\}$ be a sequence of continuous functions which converges uniformly to a
function $f$ on a set $E$. Prove that

$\lim_{n\to\infty} f_n(x_n) = f(x)$

for every sequence of points $x_n \in E$ such that $x_n \to x$, and $x \in E$. Is the converse of
this true?

10. Letting $(x)$ denote the fractional part of the real number $x$ (see Exercise 16, Chap. 4,
for the definition), consider the function

$f(x) = \sum_{n=1}^{\infty} \dfrac{(nx)}{n^2}$    ($x$ real).

Find all discontinuities of $f$, and show that they form a countable dense set.
Show that $f$ is nevertheless Riemann-integrable on every bounded interval.

11. Suppose $\{f_n\}$, $\{g_n\}$ are defined on $E$, and

(a) $\sum f_n$ has uniformly bounded partial sums;

(b) $g_n \to 0$ uniformly on $E$;

(c) $g_1(x) \ge g_2(x) \ge g_3(x) \ge \cdots$ for every $x \in E$.

Prove that $\sum f_n g_n$ converges uniformly on $E$. Hint: Compare with Theorem
3.42.

12. Suppose $g$ and $f_n$ $(n = 1, 2, 3, \dots)$ are defined on $(0, \infty)$, are Riemann-integrable on
$[t, T]$ whenever $0 < t < T < \infty$, $|f_n| \le g$, $f_n \to f$ uniformly on every compact
subset of $(0, \infty)$, and

$\int_0^{\infty} g(x)\, dx < \infty.$

Prove that

$\lim_{n\to\infty} \int_0^{\infty} f_n(x)\, dx = \int_0^{\infty} f(x)\, dx.$

(See Exercises 7 and 8 of Chap. 6 for the relevant definitions.)

This is a rather weak form of Lebesgue's dominated convergence theorem
(Theorem 11.32). Even in the context of the Riemann integral, uniform
convergence can be replaced by pointwise convergence if it is assumed that $f \in \mathscr{R}$. (See
the articles by F. Cunningham in Math. Mag., vol. 40, 1967, pp. 179-186, and
by H. Kestelman in Amer. Math. Monthly, vol. 77, 1970, pp. 182-187.)

13. Assume that $\{f_n\}$ is a sequence of monotonically increasing functions on $R^1$ with
$0 \le f_n(x) \le 1$ for all $x$ and all $n$.

(a) Prove that there is a function $f$ and a sequence $\{n_k\}$ such that

$f(x) = \lim_{k\to\infty} f_{n_k}(x)$

for every $x \in R^1$. (The existence of such a pointwise convergent subsequence is
usually called Helly's selection theorem.)

(b) If, moreover, $f$ is continuous, prove that $f_{n_k} \to f$ uniformly on $R^1$.

Hint: (i) Some subsequence $\{f_{n_i}\}$ converges at all rational points $r$, say, to
$f(r)$. (ii) Define $f(x)$, for any $x \in R^1$, to be $\sup f(r)$, the sup being taken over all
$r \le x$. (iii) Show that $f_{n_i}(x) \to f(x)$ at every $x$ at which $f$ is continuous. (This is
where monotonicity is strongly used.) (iv) A subsequence of $\{f_{n_i}\}$ converges at
every point of discontinuity of $f$ since there are at most countably many such
points. This proves (a). To prove (b), modify your proof of (iii) appropriately.

14. Let $f$ be a continuous real function on $R^1$ with the following properties:
$0 \le f(t) \le 1$, $f(t + 2) = f(t)$ for every $t$, and

$f(t) = 0$ if $0 \le t \le \tfrac{1}{3}$,    $f(t) = 1$ if $\tfrac{2}{3} \le t \le 1$.

Put $\Phi(t) = (x(t), y(t))$, where

$x(t) = \sum_{n=1}^{\infty} 2^{-n} f(3^{2n-1} t)$,    $y(t) = \sum_{n=1}^{\infty} 2^{-n} f(3^{2n} t)$.

Prove that $\Phi$ is continuous and that $\Phi$ maps $I = [0, 1]$ onto the unit square $I^2 \subset R^2$.
In fact, show that $\Phi$ maps the Cantor set onto $I^2$.

Hint: Each $(x_0, y_0) \in I^2$ has the form

$x_0 = \sum_{n=1}^{\infty} 2^{-n} a_{2n-1}$,    $y_0 = \sum_{n=1}^{\infty} 2^{-n} a_{2n}$,

where each $a_i$ is 0 or 1. If

$t_0 = \sum_{i=1}^{\infty} 3^{-i-1} (2a_i)$

show that $f(3^k t_0) = a_k$, and hence that $x(t_0) = x_0$, $y(t_0) = y_0$.

(This simple example of a so-called "space-filling curve" is due to I. J.
Schoenberg, Bull. A.M.S., vol. 44, 1938, pp. 519.)

15. Suppose $f$ is a real continuous function on $R^1$, $f_n(t) = f(nt)$ for $n = 1, 2, 3, \dots$, and
$\{f_n\}$ is equicontinuous on $[0, 1]$. What conclusion can you draw about $f$?

16. Suppose $\{f_n\}$ is an equicontinuous sequence of functions on a compact set $K$, and
$\{f_n\}$ converges pointwise on $K$. Prove that $\{f_n\}$ converges uniformly on $K$.

17. Define the notions of uniform convergence and equicontinuity for mappings into
any metric space. Show that Theorems 7.9 and 7.12 are valid for mappings into
any metric space, that Theorems 7.8 and 7.11 are valid for mappings into any
complete metric space, and that Theorems 7.10, 7.16, 7.17, 7.24, and 7.25 hold for
vector-valued functions, that is, for mappings into any $R^k$.

18. Let $\{f_n\}$ be a uniformly bounded sequence of functions which are
Riemann-integrable on $[a, b]$, and put

$F_n(x) = \int_a^x f_n(t)\, dt$    $(a \le x \le b)$.

Prove that there exists a subsequence $\{F_{n_k}\}$ which converges uniformly on $[a, b]$.

19. Let $K$ be a compact metric space, let $S$ be a subset of $\mathscr{C}(K)$. Prove that $S$ is compact
(with respect to the metric defined in Section 7.14) if and only if $S$ is uniformly
closed, pointwise bounded, and equicontinuous. (If $S$ is not equicontinuous,
then $S$ contains a sequence which has no equicontinuous subsequence, hence has
no subsequence that converges uniformly on $K$.)

20. If $f$ is continuous on $[0, 1]$ and if

$\int_0^1 f(x) x^n\, dx = 0$    $(n = 0, 1, 2, \dots)$,

prove that $f(x) = 0$ on $[0, 1]$. Hint: The integral of the product of $f$ with any
polynomial is zero. Use the Weierstrass theorem to show that $\int_0^1 f^2(x)\, dx = 0$.

21. Let $K$ be the unit circle in the complex plane (i.e., the set of all $z$ with $|z| = 1$), and
let $\mathscr{A}$ be the algebra of all functions of the form

$f(e^{i\theta}) = \sum_{n=0}^{N} c_n e^{in\theta}$    ($\theta$ real).

Then $\mathscr{A}$ separates points on $K$ and $\mathscr{A}$ vanishes at no point of $K$, but nevertheless
there are continuous functions on $K$ which are not in the uniform closure of $\mathscr{A}$.
Hint: For every $f \in \mathscr{A}$

$\int_0^{2\pi} f(e^{i\theta}) e^{i\theta}\, d\theta = 0,$

and this is also true for every $f$ in the closure of $\mathscr{A}$.

22. Assume $f \in \mathscr{R}(\alpha)$ on $[a, b]$, and prove that there are polynomials $P_n$ such that

$\lim_{n\to\infty} \int_a^b |f - P_n|^2\, d\alpha = 0.$

(Compare with Exercise 12, Chap. 6.)

23. Put $P_0 = 0$, and define, for $n = 0, 1, 2, \dots$,

$P_{n+1}(x) = P_n(x) + \dfrac{x^2 - P_n^2(x)}{2}.$

Prove that

$\lim_{n\to\infty} P_n(x) = |x|,$

uniformly on $[-1, 1]$.

(This makes it possible to prove the Stone-Weierstrass theorem without first
proving Theorem 7.26.)

Hint: Use the identity

$|x| - P_{n+1}(x) = [\,|x| - P_n(x)]\left[1 - \dfrac{|x| + P_n(x)}{2}\right]$

to prove that $0 \le P_n(x) \le P_{n+1}(x) \le |x|$ if $|x| \le 1$, and that

$|x| - P_n(x) \le |x|\left(1 - \dfrac{|x|}{2}\right)^n < \dfrac{2}{n+1}$

if $|x| \le 1$.

24. Let $X$ be a metric space, with metric $d$. Fix a point $a \in X$. Assign to each $p \in X$
the function $f_p$ defined by

$f_p(x) = d(x, p) - d(x, a)$    $(x \in X)$.

Prove that $|f_p(x)| \le d(a, p)$ for all $x \in X$, and that therefore $f_p \in \mathscr{C}(X)$.

Prove that

$\|f_p - f_q\| = d(p, q)$

for all $p, q \in X$.

If $\Phi(p) = f_p$ it follows that $\Phi$ is an isometry (a distance-preserving mapping)
of $X$ onto $\Phi(X) \subset \mathscr{C}(X)$.

Let $Y$ be the closure of $\Phi(X)$ in $\mathscr{C}(X)$. Show that $Y$ is complete.

Conclusion: $X$ is isometric to a dense subset of a complete metric space $Y$.
(Exercise 24, Chap. 3 contains a different proof of this.)

25. Suppose $\phi$ is a continuous bounded real function in the strip defined by
$0 \le x \le 1$, $-\infty < y < \infty$. Prove that the initial-value problem

$y' = \phi(x, y)$,    $y(0) = c$

has a solution. (Note that the hypotheses of this existence theorem are less stringent
than those of the corresponding uniqueness theorem; see Exercise 27, Chap. 5.)

Hint: Fix $n$. For $i = 0, \dots, n$, put $x_i = i/n$. Let $f_n$ be a continuous function
on $[0, 1]$ such that $f_n(0) = c$,

$f_n'(t) = \phi(x_i, f_n(x_i))$    if $x_i < t < x_{i+1}$,

and put

$\Delta_n(t) = f_n'(t) - \phi(t, f_n(t)),$

except at the points $x_i$, where $\Delta_n(t) = 0$. Then

$f_n(x) = c + \int_0^x [\phi(t, f_n(t)) + \Delta_n(t)]\, dt.$

Choose $M < \infty$ so that $|\phi| \le M$. Verify the following assertions.

(a) $|f_n'| \le M$, $|\Delta_n| \le 2M$, $\Delta_n \in \mathscr{R}$, and $|f_n| \le |c| + M = M_1$, say, on $[0, 1]$, for
all $n$.

(b) $\{f_n\}$ is equicontinuous on $[0, 1]$, since $|f_n'| \le M$.

(c) Some $\{f_{n_k}\}$ converges to some $f$, uniformly on $[0, 1]$.

(d) Since $\phi$ is uniformly continuous on the rectangle $0 \le x \le 1$, $|y| \le M_1$,

$\phi(t, f_{n_k}(t)) \to \phi(t, f(t))$

uniformly on $[0, 1]$.

(e) $\Delta_n(t) \to 0$ uniformly on $[0, 1]$, since

$\Delta_n(t) = \phi(x_i, f_n(x_i)) - \phi(t, f_n(t))$

in $(x_i, x_{i+1})$.

(f) Hence

$f(x) = c + \int_0^x \phi(t, f(t))\, dt.$

This $f$ is a solution of the given problem.

26. Prove an analogous existence theorem for the initial-value problem

$\mathbf{y}' = \Phi(x, \mathbf{y})$,    $\mathbf{y}(0) = \mathbf{c}$,

where now $\mathbf{c} \in R^k$, $\mathbf{y} \in R^k$, and $\Phi$ is a continuous bounded mapping of the part of
$R^{k+1}$ defined by $0 \le x \le 1$, $\mathbf{y} \in R^k$ into $R^k$. (Compare Exercise 28, Chap. 5.) Hint:
Use the vector-valued version of Theorem 7.25.
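
The iteration of Exercise 23 can be explored numerically before attempting the proof. The sketch below is an illustration only; the function names and the grid size are choices made here. It evaluates $P_n$ by running the recursion $P_{n+1} = P_n + (x^2 - P_n^2)/2$ pointwise and compares the largest error on a grid with the bound $2/(n+1)$ from the hint.

```python
def abs_approximation(x, n):
    """Value P_n(x) from the recursion P_0 = 0, P_{k+1} = P_k + (x^2 - P_k^2) / 2."""
    p = 0.0
    for _ in range(n):
        p = p + (x * x - p * p) / 2
    return p

def sup_error(n, grid=200):
    """Largest value of |x| - P_n(x) over a uniform grid in [-1, 1]."""
    xs = [-1 + 2 * i / grid for i in range(grid + 1)]
    return max(abs(abs(x) - abs_approximation(x, n)) for x in xs)

for n in (1, 10, 100):
    print(n, sup_error(n), 2 / (n + 1))
```

A grid check of this kind is of course no substitute for the estimate asked for in the exercise; it only shows what the bound looks like in practice.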



8 

SOME SPECIAL FUNCTIONS 


POWER SERIES 

In this section we shall derive some properties of functions which are represented
by power series, i.e., functions of the form

(1)    $f(x) = \sum_{n=0}^{\infty} c_n x^n$

or, more generally,

(2)    $f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n.$

These are called analytic functions.

We shall restrict ourselves to real values of $x$. Instead of circles of
convergence (see Theorem 3.39) we shall therefore encounter intervals of
convergence.

If (1) converges for all $x$ in $(-R, R)$, for some $R > 0$ ($R$ may be $+\infty$),
we say that $f$ is expanded in a power series about the point $x = 0$. Similarly, if
(2) converges for $|x - a| < R$, $f$ is said to be expanded in a power series about
the point $x = a$. As a matter of convenience, we shall often take $a = 0$ without
any loss of generality.

8.1 Theorem Suppose the series

(3)    $\sum_{n=0}^{\infty} c_n x^n$

converges for $|x| < R$, and define

(4)    $f(x) = \sum_{n=0}^{\infty} c_n x^n$    $(|x| < R)$.

Then (3) converges uniformly on $[-R + \varepsilon, R - \varepsilon]$, no matter which $\varepsilon > 0$
is chosen. The function $f$ is continuous and differentiable in $(-R, R)$, and

(5)    $f'(x) = \sum_{n=1}^{\infty} n c_n x^{n-1}$    $(|x| < R)$.

Proof Let $\varepsilon > 0$ be given. For $|x| \le R - \varepsilon$, we have

$|c_n x^n| \le |c_n (R - \varepsilon)^n|$;

and since

$\sum c_n (R - \varepsilon)^n$

converges absolutely (every power series converges absolutely in the
interior of its interval of convergence, by the root test), Theorem 7.10
shows the uniform convergence of (3) on $[-R + \varepsilon, R - \varepsilon]$.

Since $\sqrt[n]{n} \to 1$ as $n \to \infty$, we have

$\limsup_{n\to\infty} \sqrt[n]{n |c_n|} = \limsup_{n\to\infty} \sqrt[n]{|c_n|},$

so that the series (4) and (5) have the same interval of convergence.

Since (5) is a power series, it converges uniformly in $[-R + \varepsilon, R - \varepsilon]$,
for every $\varepsilon > 0$, and we can apply Theorem 7.17 (for series
instead of sequences). It follows that (5) holds if $|x| \le R - \varepsilon$.

But, given any $x$ such that $|x| < R$, we can find an $\varepsilon > 0$ such that
$|x| < R - \varepsilon$. This shows that (5) holds for $|x| < R$.

Continuity of $f$ follows from the existence of $f'$ (Theorem 5.2).
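
Formula (5) can be illustrated on the geometric series, where both sides are known in closed form. This is a minimal sketch with assumptions made here: the coefficients $c_n = 1$ (so $f(x) = 1/(1 - x)$ and $R = 1$), the sample point $x = 0.5$, and the number of terms are illustrative choices, and the helper names are hypothetical.

```python
def series(coeff, x, terms):
    """Partial sum of sum_{n >= 0} c_n x^n."""
    return sum(coeff(n) * x**n for n in range(terms))

def derived_series(coeff, x, terms):
    """Partial sum of the termwise derivative sum_{n >= 1} n c_n x^(n-1), as in (5)."""
    return sum(n * coeff(n) * x ** (n - 1) for n in range(1, terms))

c = lambda n: 1.0   # geometric series: f(x) = 1 / (1 - x) for |x| < 1
x = 0.5
print(series(c, x, 200), 1 / (1 - x))               # f(x)
print(derived_series(c, x, 200), 1 / (1 - x) ** 2)  # f'(x)
```

Both partial sums agree with the closed forms to within rounding, as the theorem predicts for any point inside the interval of convergence.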

Corollary Under the hypotheses of Theorem 8.1, $f$ has derivatives of all
orders in $(-R, R)$, which are given by

(6)    $f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1)\, c_n x^{n-k}.$

In particular,

(7)    $f^{(k)}(0) = k!\, c_k$    $(k = 0, 1, 2, \dots)$.

(Here $f^{(0)}$ means $f$, and $f^{(k)}$ is the $k$th derivative of $f$, for $k = 1, 2, 3, \dots$.)

Proof Equation (6) follows if we apply Theorem 8.1 successively to $f$,
$f'$, $f''$, $\dots$. Putting $x = 0$ in (6), we obtain (7).

Formula (7) is very interesting. It shows, on the one hand, that the
coefficients of the power series development of $f$ are determined by the values
of $f$ and of its derivatives at a single point. On the other hand, if the coefficients
are given, the values of the derivatives of $f$ at the center of the interval of
convergence can be read off immediately from the power series.

Note, however, that although a function $f$ may have derivatives of all
orders, the series $\sum c_n x^n$, where $c_n$ is computed by (7), need not converge to $f(x)$
for any $x \ne 0$. In this case, $f$ cannot be expanded in a power series about $x = 0$.
For if we had $f(x) = \sum a_n x^n$, we should have

$n!\, a_n = f^{(n)}(0)$;

hence $a_n = c_n$. An example of this situation is given in Exercise 1.

If the series (3) converges at an endpoint, say at $x = R$, then $f$ is continuous
not only in $(-R, R)$, but also at $x = R$. This follows from Abel's theorem (for
simplicity of notation, we take $R = 1$):

8.2 Theorem Suppose $\sum c_n$ converges. Put

$f(x) = \sum_{n=0}^{\infty} c_n x^n$    $(-1 < x < 1)$.

Then

(8)    $\lim_{x\to 1} f(x) = \sum_{n=0}^{\infty} c_n.$

Proof Let $s_n = c_0 + \cdots + c_n$, $s_{-1} = 0$. Then

$\sum_{n=0}^{m} c_n x^n = \sum_{n=0}^{m} (s_n - s_{n-1}) x^n = (1 - x) \sum_{n=0}^{m-1} s_n x^n + s_m x^m.$

For $|x| < 1$, we let $m \to \infty$ and obtain

$f(x) = (1 - x) \sum_{n=0}^{\infty} s_n x^n.$

Suppose $s = \lim_{n\to\infty} s_n$. Let $\varepsilon > 0$ be given. Choose $N$ so that $n > N$
implies

(9)    $|s - s_n| < \dfrac{\varepsilon}{2}.$

Then, since

$(1 - x) \sum_{n=0}^{\infty} x^n = 1$    $(|x| < 1)$,

we obtain from (9)

$|f(x) - s| = \left| (1 - x) \sum_{n=0}^{\infty} (s_n - s) x^n \right| \le (1 - x) \sum_{n=0}^{N} |s_n - s|\, |x|^n + \dfrac{\varepsilon}{2} \le \varepsilon$

if $x > 1 - \delta$, for some suitably chosen $\delta > 0$. This implies (8).
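
Abel's theorem can be watched at work on the alternating harmonic series. The sketch below is illustrative and rests on choices made here: the coefficients $c_n = (-1)^n/(n+1)$, whose sum is $\log 2$, and the closed form $f(x) = \log(1 + x)/x$ for $0 < x < 1$; the function name and the term count are arbitrary.

```python
import math

def f(x, terms=50000):
    """Partial sum of sum_{n >= 0} (-1)^n x^n / (n + 1), which is log(1 + x) / x for 0 < x < 1."""
    total, power = 0.0, 1.0
    for n in range(terms):
        total += (-1) ** n * power / (n + 1)
        power *= x
    return total

for x in (0.9, 0.99, 0.999):
    # the values approach log 2, the sum of the series at the endpoint x = 1
    print(x, f(x), math.log(2))
```

The convergence of $\sum c_n$ is essential here: the series is not absolutely convergent at $x = 1$, so Theorem 7.10 alone would not give continuity at the endpoint.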

As an application, let us prove Theorem 3.51, which asserts: If $\sum a_n$, $\sum b_n$,
$\sum c_n$ converge to $A$, $B$, $C$, and if $c_n = a_0 b_n + \cdots + a_n b_0$, then $C = AB$. We let

$f(x) = \sum_{n=0}^{\infty} a_n x^n$,    $g(x) = \sum_{n=0}^{\infty} b_n x^n$,    $h(x) = \sum_{n=0}^{\infty} c_n x^n$,

for $0 \le x \le 1$. For $x < 1$, these series converge absolutely and hence may be
multiplied according to Definition 3.48; when the multiplication is carried out,
we see that

(10)    $f(x) \cdot g(x) = h(x)$    $(0 \le x < 1)$.

By Theorem 8.2,

(11)    $f(x) \to A$,    $g(x) \to B$,    $h(x) \to C$

as $x \to 1$. Equations (10) and (11) imply $AB = C$.

We now require a theorem concerning an inversion in the order of
summation. (See Exercises 2 and 3.)


8.3 Theorem Given a double sequence {a^}, / = 1, 2, 3, . . . , j = 1, 2, 3, . . . , 
suppose that 

(12) IKI=6, 0 = 1,2,3,...) 

j= 1 

and Ybi converges. Then 

00 OO CO 00 

(13) Z Z a ij = Z Z a v 

i= 1 7= 1 7=1 i = 1 

Proof We could establish (13) by a direct procedure similar to (although
more involved than) the one used in Theorem 3.55. However, the following
method seems more interesting.

Let $E$ be a countable set, consisting of the points $x_0, x_1, x_2, \ldots$, and
suppose $x_n \to x_0$ as $n \to \infty$. Define

(14) $$f_i(x_0) = \sum_{j=1}^{\infty} a_{ij} \qquad (i = 1, 2, 3, \ldots),$$

(15) $$f_i(x_n) = \sum_{j=1}^{n} a_{ij} \qquad (i, n = 1, 2, 3, \ldots),$$

(16) $$g(x) = \sum_{i=1}^{\infty} f_i(x) \qquad (x \in E).$$

Now, (14) and (15), together with (12), show that each $f_i$ is continuous
at $x_0$. Since $|f_i(x)| \le b_i$ for $x \in E$, (16) converges uniformly, so
that $g$ is continuous at $x_0$ (Theorem 7.11). It follows that

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \sum_{i=1}^{\infty} f_i(x_0) = g(x_0) = \lim_{n \to \infty} g(x_n)$$

$$= \lim_{n \to \infty} \sum_{i=1}^{\infty} f_i(x_n) = \lim_{n \to \infty} \sum_{i=1}^{\infty} \sum_{j=1}^{n} a_{ij}$$

$$= \lim_{n \to \infty} \sum_{j=1}^{n} \sum_{i=1}^{\infty} a_{ij} = \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij}.$$

8.4 Theorem Suppose

$$f(x) = \sum_{n=0}^{\infty} c_n x^n,$$

the series converging in $|x| < R$. If $-R < a < R$, then $f$ can be expanded in a
power series about the point $x = a$ which converges in $|x - a| < R - |a|$, and

(17) $$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n \qquad (|x - a| < R - |a|).$$

This is an extension of Theorem 5.15 and is also known as Taylor's
theorem.


Proof We have

$$f(x) = \sum_{n=0}^{\infty} c_n [(x - a) + a]^n$$

$$= \sum_{n=0}^{\infty} c_n \sum_{m=0}^{n} \binom{n}{m} a^{n-m} (x - a)^m$$

$$= \sum_{m=0}^{\infty} \left[ \sum_{n=m}^{\infty} \binom{n}{m} c_n a^{n-m} \right] (x - a)^m.$$

This is the desired expansion about the point $x = a$. To prove its validity,
we have to justify the change which was made in the order of summation.
Theorem 8.3 shows that this is permissible if

(18) $$\sum_{n=0}^{\infty} \sum_{m=0}^{n} \left| c_n \binom{n}{m} a^{n-m} (x - a)^m \right|$$

converges. But (18) is the same as

(19) $$\sum_{n=0}^{\infty} |c_n| \cdot (|x - a| + |a|)^n,$$

and (19) converges if $|x - a| + |a| < R$.

Finally, the form of the coefficients in (17) follows from (7).


It should be noted that (17) may actually converge in a larger interval than
the one given by $|x - a| < R - |a|$.
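The rearrangement in this proof can be checked numerically. The following Python sketch uses the geometric series $c_n = 1$ (so $f(x) = 1/(1-x)$ and $R = 1$), for which $f^{(m)}(a)/m! = (1-a)^{-(m+1)}$. The helper `recentered_coeff` and the truncation length are illustrative assumptions, not anything taken from the text.

```python
from math import comb

def recentered_coeff(c, a, m, terms=400):
    """Truncated inner sum d_m = sum_{n>=m} C(n, m) * c(n) * a**(n - m)."""
    return sum(comb(n, m) * c(n) * a ** (n - m) for n in range(m, m + terms))

a = -0.5
# Coefficients of the expansion about x = a obtained by reordering the double sum.
d = [recentered_coeff(lambda n: 1.0, a, m) for m in range(6)]
# Taylor coefficients f^(m)(a) / m! of f(x) = 1 / (1 - x).
closed_form = [(1.0 - a) ** -(m + 1) for m in range(6)]
print(max(abs(u - v) for u, v in zip(d, closed_form)))
```

With $a = -1/2$ the theorem guarantees convergence for $|x - a| < 1/2$, while the recentered series is geometric with ratio $(x - a)/(1 - a)$ and so converges for $|x - a| < 3/2$. This is an instance of the larger interval mentioned in the remark above.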

If two power series converge to the same function in $(-R, R)$, (7) shows
that the two series must be identical, i.e., they must have the same coefficients.
It is interesting that the same conclusion can be deduced from much weaker
hypotheses:


8.5 Theorem Suppose the series $\sum a_n x^n$ and $\sum b_n x^n$ converge in the segment
$S = (-R, R)$. Let $E$ be the set of all $x \in S$ at which

(20) $$\sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} b_n x^n.$$

If $E$ has a limit point in $S$, then $a_n = b_n$ for $n = 0, 1, 2, \ldots$. Hence (20) holds for
all $x \in S$.

Proof Put $c_n = a_n - b_n$ and

(21) $$f(x) = \sum_{n=0}^{\infty} c_n x^n \qquad (x \in S).$$

Then $f(x) = 0$ on $E$.

Let $A$ be the set of all limit points of $E$ in $S$, and let $B$ consist of all
other points of $S$. It is clear from the definition of “limit point” that $B$
is open. Suppose we can prove that $A$ is open. Then $A$ and $B$ are disjoint
open sets. Hence they are separated (Definition 2.45). Since $S = A \cup B$,
and $S$ is connected, one of $A$ and $B$ must be empty. By hypothesis, $A$ is
not empty. Hence $B$ is empty, and $A = S$. Since $f$ is continuous in $S$,
$A \subset E$. Thus $E = S$, and (7) shows that $c_n = 0$ for $n = 0, 1, 2, \ldots$, which
is the desired conclusion.

Thus we have to prove that $A$ is open. If $x_0 \in A$, Theorem 8.4 shows
that

(22) $$f(x) = \sum_{n=0}^{\infty} d_n (x - x_0)^n \qquad (|x - x_0| < R - |x_0|).$$

We claim that $d_n = 0$ for all $n$. Otherwise, let $k$ be the smallest
nonnegative integer such that $d_k \ne 0$. Then

(23) $$f(x) = (x - x_0)^k g(x) \qquad (|x - x_0| < R - |x_0|),$$

where

(24) $$g(x) = \sum_{m=0}^{\infty} d_{k+m} (x - x_0)^m.$$

Since $g$ is continuous at $x_0$ and

$$g(x_0) = d_k \ne 0,$$

there exists a $\delta > 0$ such that $g(x) \ne 0$ if $|x - x_0| < \delta$. It follows from
(23) that $f(x) \ne 0$ if $0 < |x - x_0| < \delta$. But this contradicts the fact that
$x_0$ is a limit point of $E$.

Thus $d_n = 0$ for all $n$, so that $f(x) = 0$ for all $x$ for which (22) holds,
i.e., in a neighborhood of $x_0$. This shows that $A$ is open, and completes
the proof.


THE EXPONENTIAL AND LOGARITHMIC FUNCTIONS 


We define

(25) $$E(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$

The ratio test shows that this series converges for every complex $z$. Applying
Theorem 3.50 on multiplication of absolutely convergent series, we obtain

$$E(z)E(w) = \sum_{n=0}^{\infty} \frac{z^n}{n!} \sum_{m=0}^{\infty} \frac{w^m}{m!} = \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{z^k w^{n-k}}{k!\,(n-k)!}$$

$$= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} z^k w^{n-k} = \sum_{n=0}^{\infty} \frac{(z + w)^n}{n!},$$

which gives us the important addition formula

(26) $$E(z + w) = E(z)E(w) \qquad (z, w \text{ complex}).$$

One consequence is that

(27) $$E(z)E(-z) = E(z - z) = E(0) = 1 \qquad (z \text{ complex}).$$

This shows that $E(z) \ne 0$ for all $z$. By (25), $E(x) > 0$ if $x > 0$; hence (27) shows
that $E(x) > 0$ for all real $x$. By (25), $E(x) \to +\infty$ as $x \to +\infty$; hence (27) shows
that $E(x) \to 0$ as $x \to -\infty$ along the real axis. By (25), $0 < x < y$ implies that
$E(x) < E(y)$; by (27), it follows that $E(-y) < E(-x)$; hence $E$ is strictly
increasing on the whole real axis.
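The definition (25) and the identities (26) and (27) lend themselves to a quick numerical check. The sketch below sums the series with a fixed number of terms; the function name `E`, the truncation at 80 terms, and the sample points `z`, `w` are assumptions made for illustration.

```python
import math

def E(z, terms=80):
    """Partial sum of the series (25): sum_{n>=0} z**n / n!."""
    total = 0j
    term = 1 + 0j            # z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z**n / n! into z**(n+1) / (n+1)!
    return total

z, w = 1.5 - 0.5j, -0.75 + 2j
print(abs(E(z) * E(w) - E(z + w)))  # addition formula (26)
print(abs(E(z) * E(-z) - 1))        # consequence (27)
print(abs(E(1) - math.e))           # E(1) = e
```

The truncation is adequate only for moderate $|z|$; a larger argument would need more terms.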

The addition formula also shows that

(28) $$\lim_{h \to 0} \frac{E(z + h) - E(z)}{h} = E(z) \lim_{h \to 0} \frac{E(h) - 1}{h} = E(z);$$

the last equality follows directly from (25).

Iteration of (26) gives

(29) $$E(z_1 + \cdots + z_n) = E(z_1) \cdots E(z_n).$$

Let us take $z_1 = \cdots = z_n = 1$. Since $E(1) = e$, where $e$ is the number defined
in Definition 3.30, we obtain

(30) $$E(n) = e^n \qquad (n = 1, 2, 3, \ldots).$$

If $p = n/m$, where $n$, $m$ are positive integers, then

(31) $$[E(p)]^m = E(mp) = E(n) = e^n,$$

so that

(32) $$E(p) = e^p \qquad (p > 0,\ p \text{ rational}).$$

It follows from (27) that $E(-p) = e^{-p}$ if $p$ is positive and rational. Thus (32)
holds for all rational $p$.

In Exercise 6, Chap. 1, we suggested the definition

(33) $$x^y = \sup x^p,$$

where the sup is taken over all rational $p$ such that $p < y$, for any real $y$, and
$x > 1$. If we thus define, for any real $x$,

(34) $$e^x = \sup e^p \qquad (p < x,\ p \text{ rational}),$$

the continuity and monotonicity properties of $E$, together with (32), show that

(35) $$E(x) = e^x$$

for all real $x$. Equation (35) explains why $E$ is called the exponential function.

The notation $\exp(x)$ is often used in place of $e^x$, especially when $x$ is a
complicated expression.

Actually one may very well use (35) instead of (34) as the definition of $e^x$;
(35) is a much more convenient starting point for the investigation of the
properties of $e^x$. We shall see presently that (33) may also be replaced by a
more convenient definition [see (43)].

We now revert to the customary notation, $e^x$, in place of $E(x)$, and
summarize what we have proved so far.


8.6 Theorem Let $e^x$ be defined on $R^1$ by (35) and (25). Then

(a) $e^x$ is continuous and differentiable for all $x$;

(b) $(e^x)' = e^x$;

(c) $e^x$ is a strictly increasing function of $x$, and $e^x > 0$;

(d) $e^{x+y} = e^x e^y$;

(e) $e^x \to +\infty$ as $x \to +\infty$, $e^x \to 0$ as $x \to -\infty$;

(f) $\lim_{x \to +\infty} x^n e^{-x} = 0$, for every $n$.

Proof We have already proved (a) to (e); (25) shows that

$$e^x > \frac{x^{n+1}}{(n+1)!}$$

for $x > 0$, so that

$$x^n e^{-x} < \frac{(n+1)!}{x},$$

and (f) follows. Part (f) shows that $e^x$ tends to $+\infty$ “faster” than any
power of $x$, as $x \to +\infty$.


Since $E$ is strictly increasing and differentiable on $R^1$, it has an inverse
function $L$ which is also strictly increasing and differentiable and whose domain
is $E(R^1)$, that is, the set of all positive numbers. $L$ is defined by

(36) $$E(L(y)) = y \qquad (y > 0),$$

or, equivalently, by

(37) $$L(E(x)) = x \qquad (x \text{ real}).$$

Differentiating (37), we get (compare Theorem 5.5)

$$L'(E(x)) \cdot E(x) = 1.$$

Writing $y = E(x)$, this gives us

(38) $$L'(y) = \frac{1}{y} \qquad (y > 0).$$

Taking $x = 0$ in (37), we see that $L(1) = 0$. Hence (38) implies

(39) $$L(y) = \int_1^y \frac{dx}{x}.$$

Quite frequently, (39) is taken as the starting point of the theory of the logarithm
and the exponential function. Writing $u = E(x)$, $v = E(y)$, (26) gives

$$L(uv) = L(E(x) \cdot E(y)) = L(E(x + y)) = x + y,$$

so that

(40) $$L(uv) = L(u) + L(v) \qquad (u > 0,\ v > 0).$$

This shows that $L$ has the familiar property which makes logarithms useful
tools for computation. The customary notation for $L(x)$ is of course $\log x$.

As to the behavior of $\log x$ as $x \to +\infty$ and as $x \to 0$, Theorem 8.6(e)
shows that

$$\log x \to +\infty \quad \text{as } x \to +\infty,$$

$$\log x \to -\infty \quad \text{as } x \to 0.$$
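Formula (39) can be used directly as a computational definition. The sketch below approximates the integral with the composite Simpson rule (a quadrature choice made here, not something the text prescribes) and checks the functional equation (40) and the value $L(e) = 1$. The function name `L` follows the text; the step count is an arbitrary assumption.

```python
import math

def L(y, steps=10_000):
    """Approximate L(y) = integral from 1 to y of dx/x (composite Simpson rule)."""
    if y <= 0:
        raise ValueError("L is defined only for y > 0")
    h = (y - 1.0) / steps      # negative when y < 1, which makes L(y) negative
    total = 1.0 + 1.0 / y      # integrand 1/x at the two endpoints
    for k in range(1, steps):  # steps is even, as Simpson's rule requires
        total += (4 if k % 2 else 2) / (1.0 + k * h)
    return total * h / 3.0

print(L(6.0) - (L(2.0) + L(3.0)))  # (40): L(uv) = L(u) + L(v)
print(L(math.e) - 1.0)             # L(E(1)) = 1, by (37)
```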

It is easily seen that

(41) $$x^n = E(nL(x))$$

if $x > 0$ and $n$ is an integer. Similarly, if $m$ is a positive integer, we have

(42) $$x^{1/m} = E\left(\frac{1}{m} L(x)\right),$$

since each term of (42), when raised to the $m$th power, yields the corresponding
term of (37). Combining (41) and (42), we obtain

(43) $$x^{\alpha} = E(\alpha L(x)) = e^{\alpha \log x}$$

for any rational $\alpha$.

We now define $x^{\alpha}$, for any real $\alpha$ and any $x > 0$, by (43). The continuity
and monotonicity of $E$ and $L$ show that this definition leads to the same result
as the previously suggested one. The facts stated in Exercise 6 of Chap. 1 are
trivial consequences of (43).

If we differentiate (43), we obtain, by Theorem 5.5,

(44) $$(x^{\alpha})' = E(\alpha L(x)) \cdot \frac{\alpha}{x} = \alpha x^{\alpha - 1}.$$

Note that we have previously used (44) only for integral values of $\alpha$, in which
case (44) follows easily from Theorem 5.3(b). To prove (44) directly from the
definition of the derivative, if $x^{\alpha}$ is defined by (33) and $\alpha$ is irrational, is quite
troublesome.

The well-known integration formula for $x^{\alpha}$ follows from (44) if $\alpha \ne -1$,
and from (38) if $\alpha = -1$. We wish to demonstrate one more property of $\log x$,
namely,

(45) $$\lim_{x \to +\infty} x^{-\alpha} \log x = 0$$

for every $\alpha > 0$. That is, $\log x \to +\infty$ “slower” than any positive power of $x$,
as $x \to +\infty$.

For if $0 < \varepsilon < \alpha$, and $x > 1$, then

$$x^{-\alpha} \log x = x^{-\alpha} \int_1^x t^{-1}\,dt < x^{-\alpha} \int_1^x t^{\varepsilon - 1}\,dt = x^{-\alpha} \cdot \frac{x^{\varepsilon} - 1}{\varepsilon} < \frac{x^{\varepsilon - \alpha}}{\varepsilon},$$

and (45) follows. We could also have used Theorem 8.6(f) to derive (45).


THE TRIGONOMETRIC FUNCTIONS 

Let us define

(46) $$C(x) = \frac{1}{2}\,[E(ix) + E(-ix)], \qquad S(x) = \frac{1}{2i}\,[E(ix) - E(-ix)].$$

We shall show that $C(x)$ and $S(x)$ coincide with the functions $\cos x$ and $\sin x$,
whose definition is usually based on geometric considerations. By (25),
$E(\bar{z}) = \overline{E(z)}$. Hence (46) shows that $C(x)$ and $S(x)$ are real for real $x$. Also,

(47) $$E(ix) = C(x) + iS(x).$$

Thus $C(x)$ and $S(x)$ are the real and imaginary parts, respectively, of $E(ix)$, if
$x$ is real. By (27),

$$|E(ix)|^2 = E(ix)\overline{E(ix)} = E(ix)E(-ix) = 1,$$

so that

(48) $$|E(ix)| = 1 \qquad (x \text{ real}).$$

From (46) we can read off that $C(0) = 1$, $S(0) = 0$, and (28) shows that

(49) $$C'(x) = -S(x), \qquad S'(x) = C(x).$$

We assert that there exist positive numbers $x$ such that $C(x) = 0$. For
suppose this is not so. Since $C(0) = 1$, it then follows that $C(x) > 0$ for all
$x > 0$, hence $S'(x) > 0$, by (49), hence $S$ is strictly increasing; and since $S(0) = 0$,
we have $S(x) > 0$ if $x > 0$. Hence if $0 < x < y$, we have

(50) $$S(x)(y - x) < \int_x^y S(t)\,dt = C(x) - C(y) \le 2.$$

The last inequality follows from (48) and (47). Since $S(x) > 0$, (50) cannot be
true for large $y$, and we have a contradiction.

Let $x_0$ be the smallest positive number such that $C(x_0) = 0$. This exists,
since the set of zeros of a continuous function is closed, and $C(0) \ne 0$. We
define the number $\pi$ by

(51) $$\pi = 2x_0.$$

Then $C(\pi/2) = 0$, and (48) shows that $S(\pi/2) = \pm 1$. Since $C(x) > 0$ in
$(0, \pi/2)$, $S$ is increasing in $(0, \pi/2)$; hence $S(\pi/2) = 1$. Thus

$$E\left(\frac{\pi i}{2}\right) = i,$$

and the addition formula gives

(52) $$E(\pi i) = -1, \qquad E(2\pi i) = 1;$$

hence

(53) $$E(z + 2\pi i) = E(z) \qquad (z \text{ complex}).$$
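The definition (51) of $\pi$ is constructive enough to be tried out numerically. The sketch below builds $C$ and $S$ from the series (25) via (46) and locates a zero of $C$ in $[0, 2]$ by bisection. It assumes, without proving it in code, that $C(2) < 0$ and that $C$ changes sign only once on that interval, so that the zero found is the smallest positive one. All function names other than `E`, `C`, `S` are hypothetical.

```python
import math

def E(z, terms=60):
    """Partial sum of the exponential series (25)."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

def C(x):
    return ((E(1j * x) + E(-1j * x)) / 2).real

def S(x):
    return ((E(1j * x) - E(-1j * x)) / 2j).real

def smallest_positive_zero_of_C(lo=0.0, hi=2.0, iterations=60):
    """Bisection, assuming C(lo) > 0 > C(hi) with a single sign change between."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if C(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pi_estimate = 2 * smallest_positive_zero_of_C()  # definition (51)
print(pi_estimate - math.pi)                     # comparison with the library constant
print(abs(E(1j * pi_estimate) + 1))              # (52): E(pi i) = -1
```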

8.7 Theorem

(a) The function $E$ is periodic, with period $2\pi i$.

(b) The functions $C$ and $S$ are periodic, with period $2\pi$.

(c) If $0 < t < 2\pi$, then $E(it) \ne 1$.

(d) If $z$ is a complex number with $|z| = 1$, there is a unique $t$ in $[0, 2\pi)$
such that $E(it) = z$.

Proof By (53), (a) holds; and (b) follows from (a) and (46).

Suppose $0 < t < \pi/2$ and $E(it) = x + iy$, with $x$, $y$ real. Our preceding
work shows that $0 < x < 1$, $0 < y < 1$. Note that

$$E(4it) = (x + iy)^4 = x^4 - 6x^2 y^2 + y^4 + 4ixy(x^2 - y^2).$$

If $E(4it)$ is real, it follows that $x^2 - y^2 = 0$; since $x^2 + y^2 = 1$, by (48),
we have $x^2 = y^2 = \frac{1}{2}$, hence $E(4it) = -1$. This proves (c).

If $0 \le t_1 < t_2 < 2\pi$, then

$$E(it_2)[E(it_1)]^{-1} = E(it_2 - it_1) \ne 1,$$

by (c). This establishes the uniqueness assertion in (d).

To prove the existence assertion in (d), fix $z$ so that $|z| = 1$. Write
$z = x + iy$, with $x$ and $y$ real. Suppose first that $x \ge 0$ and $y \ge 0$. On
$[0, \pi/2]$, $C$ decreases from 1 to 0. Hence $C(t) = x$ for some $t \in [0, \pi/2]$.
Since $C^2 + S^2 = 1$ and $S \ge 0$ on $[0, \pi/2]$, it follows that $z = E(it)$.

If $x < 0$ and $y \ge 0$, the preceding conditions are satisfied by $-iz$.
Hence $-iz = E(it)$ for some $t \in [0, \pi/2]$, and since $i = E(\pi i/2)$, we obtain
$z = E(i(t + \pi/2))$. Finally, if $y < 0$, the preceding two cases show that
$-z = E(it)$ for some $t \in (0, \pi)$. Hence $z = -E(it) = E(i(t + \pi))$.

This proves (d), and hence the theorem.

It follows from (d) and (48) that the curve $\gamma$ defined by

(54) $$\gamma(t) = E(it) \qquad (0 \le t \le 2\pi)$$

is a simple closed curve whose range is the unit circle in the plane. Since
$\gamma'(t) = iE(it)$, the length of $\gamma$ is

$$\int_0^{2\pi} |\gamma'(t)|\,dt = 2\pi,$$

by Theorem 6.27. This is of course the expected result for the circumference of
a circle of radius 1. It shows that $\pi$, defined by (51), has the usual geometric
significance.

In the same way we see that the point $\gamma(t)$ describes a circular arc of length
$t_0$ as $t$ increases from 0 to $t_0$. Consideration of the triangle whose vertices are

$$z_1 = 0, \qquad z_2 = \gamma(t_0), \qquad z_3 = C(t_0)$$

shows that $C(t)$ and $S(t)$ are indeed identical with $\cos t$ and $\sin t$, if the latter
are defined in the usual way as ratios of the sides of a right triangle.

It should be stressed that we derived the basic properties of the trigonometric
functions from (46) and (25), without any appeal to the geometric notion
of angle. There are other nongeometric approaches to these functions. The
papers by W. F. Eberlein (Amer. Math. Monthly, vol. 74, 1967, pp. 1223-1225)
and by G. B. Robison (Math. Mag., vol. 41, 1968, pp. 66-70) deal with these
topics.


THE ALGEBRAIC COMPLETENESS OF THE COMPLEX FIELD 

We are now in a position to give a simple proof of the fact that the complex 
field is algebraically complete, that is to say, that every nonconstant polynomial 
with complex coefficients has a complex root. 

8.8 Theorem Suppose $a_0, \ldots, a_n$ are complex numbers, $n \ge 1$, $a_n \ne 0$,

$$P(z) = \sum_{k=0}^{n} a_k z^k.$$

Then $P(z) = 0$ for some complex number $z$.

Proof Without loss of generality, assume $a_n = 1$. Put

(55) $$\mu = \inf |P(z)| \qquad (z \text{ complex}).$$

If $|z| = R$, then

(56) $$|P(z)| \ge R^n \left[ 1 - |a_{n-1}| R^{-1} - \cdots - |a_0| R^{-n} \right].$$

The right side of (56) tends to $\infty$ as $R \to \infty$. Hence there exists $R_0$ such
that $|P(z)| > \mu$ if $|z| > R_0$. Since $|P|$ is continuous on the closed disc
with center at 0 and radius $R_0$, Theorem 4.16 shows that $|P(z_0)| = \mu$ for
some $z_0$.

We claim that $\mu = 0$.

If not, put $Q(z) = P(z + z_0)/P(z_0)$. Then $Q$ is a nonconstant
polynomial, $Q(0) = 1$, and $|Q(z)| \ge 1$ for all $z$. There is a smallest integer $k$,
$1 \le k \le n$, such that

(57) $$Q(z) = 1 + b_k z^k + \cdots + b_n z^n, \qquad b_k \ne 0.$$

By Theorem 8.7(d) there is a real $\theta$ such that

(58) $$e^{ik\theta} b_k = -|b_k|.$$

If $r > 0$ and $r^k |b_k| < 1$, (58) implies

$$|1 + b_k r^k e^{ik\theta}| = 1 - r^k |b_k|,$$

so that

$$|Q(re^{i\theta})| \le 1 - r^k \{ |b_k| - r|b_{k+1}| - \cdots - r^{n-k} |b_n| \}.$$

For sufficiently small $r$, the expression in braces is positive; hence
$|Q(re^{i\theta})| < 1$, a contradiction.

Thus $\mu = 0$, that is, $P(z_0) = 0$.

Exercise 27 contains a more general result. 


FOURIER SERIES 


8.9 Definition A trigonometric polynomial is a finite sum of the form

(59) $$f(x) = a_0 + \sum_{n=1}^{N} (a_n \cos nx + b_n \sin nx) \qquad (x \text{ real}),$$

where $a_0, \ldots, a_N$, $b_1, \ldots, b_N$ are complex numbers. On account of the identities
(46), (59) can also be written in the form

(60) $$f(x) = \sum_{n=-N}^{N} c_n e^{inx} \qquad (x \text{ real}),$$

which is more convenient for most purposes. It is clear that every trigonometric
polynomial is periodic, with period $2\pi$.

If $n$ is a nonzero integer, $e^{inx}$ is the derivative of $e^{inx}/in$, which also has
period $2\pi$. Hence

(61) $$\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx}\,dx = \begin{cases} 1 & (\text{if } n = 0), \\ 0 & (\text{if } n = \pm 1, \pm 2, \ldots). \end{cases}$$

Let us multiply (60) by $e^{-imx}$, where $m$ is an integer; if we integrate the
product, (61) shows that

(62) $$c_m = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-imx}\,dx$$

for $|m| \le N$. If $|m| > N$, the integral in (62) is 0.
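Formula (62) can be tested on a concrete trigonometric polynomial. In the sketch below the integral is replaced by an equally spaced Riemann sum, which happens to be exact (up to rounding) for trigonometric polynomials of degree well below the number of sample points. The coefficient table `coeffs` and the function names are made-up illustrations.

```python
import cmath
import math

def fourier_coefficient(f, m, samples=256):
    """Riemann-sum approximation of (1/2pi) * integral_{-pi}^{pi} f(x) e^{-imx} dx."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + k * h
        total += f(x) * cmath.exp(-1j * m * x)
    return total / samples

# A hypothetical trigonometric polynomial of degree N = 2, written as in (60).
coeffs = {-2: 0.5 - 1j, 0: 2.0, 1: -3j, 2: 0.5 + 1j}

def p(x):
    return sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())

for m in range(-3, 4):
    print(m, fourier_coefficient(p, m))  # recovers c_m, and 0 when |m| > 2
```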

The following observation can be read off from (60) and (62): The
trigonometric polynomial $f$, given by (60), is real if and only if $c_{-n} = \overline{c_n}$ for
$n = 0, \ldots, N$.

In agreement with (60), we define a trigonometric series to be a series of
the form

(63) $$\sum_{n=-\infty}^{\infty} c_n e^{inx} \qquad (x \text{ real});$$

the $N$th partial sum of (63) is defined to be the right side of (60).

If $f$ is an integrable function on $[-\pi, \pi]$, the numbers $c_m$ defined by (62)
for all integers $m$ are called the Fourier coefficients of $f$, and the series (63) formed
with these coefficients is called the Fourier series of $f$.

The natural question which now arises is whether the Fourier series of $f$
converges to $f$, or, more generally, whether $f$ is determined by its Fourier series.
That is to say, if we know the Fourier coefficients of a function, can we find
the function, and if so, how?

The study of such series, and, in particular, the problem of representing a 
given function by a trigonometric series, originated in physical problems such 
as the theory of oscillations and the theory of heat conduction (Fourier's
“Théorie analytique de la chaleur” was published in 1822). The many difficult
and delicate problems which arose during this study caused a thorough revision 
and reformulation of the whole theory of functions of a real variable. Among 
many prominent names, those of Riemann, Cantor, and Lebesgue are intimately 
connected with this field, which nowadays, with all its generalizations and rami- 
fications, may well be said to occupy a central position in the whole of analysis. 

We shall be content to derive some basic theorems which are easily 
accessible by the methods developed in the preceding chapters. For more 
thorough investigations, the Lebesgue integral is a natural and indispensable 
tool. 

We shall first study more general systems of functions which share a 
property analogous to (61). 

8.10 Definition Let $\{\phi_n\}$ ($n = 1, 2, 3, \ldots$) be a sequence of complex functions
on $[a, b]$, such that

(64) $$\int_a^b \phi_n(x) \overline{\phi_m(x)}\,dx = 0 \qquad (n \ne m).$$

Then $\{\phi_n\}$ is said to be an orthogonal system of functions on $[a, b]$. If, in addition,

(65) $$\int_a^b |\phi_n(x)|^2\,dx = 1$$

for all $n$, $\{\phi_n\}$ is said to be orthonormal.

For example, the functions $(2\pi)^{-1/2} e^{inx}$ form an orthonormal system on
$[-\pi, \pi]$. So do the real functions

$$\frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \frac{\cos 2x}{\sqrt{\pi}},\ \frac{\sin 2x}{\sqrt{\pi}},\ \ldots$$

If $\{\phi_n\}$ is orthonormal on $[a, b]$ and if

(66) $$c_n = \int_a^b f(t) \overline{\phi_n(t)}\,dt \qquad (n = 1, 2, 3, \ldots),$$

we call $c_n$ the $n$th Fourier coefficient of $f$ relative to $\{\phi_n\}$. We write

(67) $$f(x) \sim \sum_{n=1}^{\infty} c_n \phi_n(x)$$

and call this series the Fourier series of $f$ (relative to $\{\phi_n\}$).

Note that the symbol $\sim$ used in (67) implies nothing about the
convergence of the series; it merely says that the coefficients are given by (66).

The following theorems show that the partial sums of the Fourier series
of $f$ have a certain minimum property. We shall assume here and in the rest of
this chapter that $f \in \mathscr{R}$, although this hypothesis can be weakened.

8.11 Theorem Let $\{\phi_n\}$ be orthonormal on $[a, b]$. Let

(68) $$s_n(x) = \sum_{m=1}^{n} c_m \phi_m(x)$$

be the $n$th partial sum of the Fourier series of $f$, and suppose

(69) $$t_n(x) = \sum_{m=1}^{n} \gamma_m \phi_m(x).$$

Then

(70) $$\int_a^b |f - s_n|^2\,dx \le \int_a^b |f - t_n|^2\,dx,$$

and equality holds if and only if

(71) $$\gamma_m = c_m \qquad (m = 1, \ldots, n).$$

That is to say, among all functions $t_n$, $s_n$ gives the best possible mean
square approximation to $f$.

Proof Let $\int$ denote the integral over $[a, b]$, $\sum$ the sum from 1 to $n$. Then

$$\int f \bar{t}_n = \int f \sum \bar{\gamma}_m \bar{\phi}_m = \sum c_m \bar{\gamma}_m$$

by the definition of $\{c_m\}$,

$$\int |t_n|^2 = \int t_n \bar{t}_n = \int \sum \gamma_m \phi_m \sum \bar{\gamma}_k \bar{\phi}_k = \sum |\gamma_m|^2$$

since $\{\phi_m\}$ is orthonormal, and so

$$\int |f - t_n|^2 = \int |f|^2 - \int f \bar{t}_n - \int \bar{f} t_n + \int |t_n|^2$$

$$= \int |f|^2 - \sum c_m \bar{\gamma}_m - \sum \bar{c}_m \gamma_m + \sum \gamma_m \bar{\gamma}_m$$

$$= \int |f|^2 - \sum |c_m|^2 + \sum |\gamma_m - c_m|^2,$$

which is evidently minimized if and only if $\gamma_m = c_m$.

Putting $\gamma_m = c_m$ in this calculation, we obtain

(72) $$\int_a^b |s_n(x)|^2\,dx = \sum_{m=1}^{n} |c_m|^2 \le \int_a^b |f(x)|^2\,dx,$$

since $\int |f - t_n|^2 \ge 0$.

8.12 Theorem If $\{\phi_n\}$ is orthonormal on $[a, b]$, and if

$$f(x) \sim \sum_{n=1}^{\infty} c_n \phi_n(x),$$

then

(73) $$\sum_{n=1}^{\infty} |c_n|^2 \le \int_a^b |f(x)|^2\,dx.$$

In particular,

(74) $$\lim_{n \to \infty} c_n = 0.$$

Proof Letting $n \to \infty$ in (72), we obtain (73), the so-called “Bessel
inequality.”
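The Bessel inequality can be watched at work on a specific example. The sketch below takes $f(x) = x$ on $[-\pi, \pi]$ and the real orthonormal functions $\phi_n(x) = \sin(nx)/\sqrt{\pi}$ (a subfamily of the system mentioned after Definition 8.10), computes the coefficients (66) with a midpoint rule, and compares the partial sums of $\sum |c_n|^2$ with $\int |f|^2$. The quadrature rule, step count, and names are assumptions made for illustration.

```python
import math

def integrate(g, a, b, steps=20_000):
    """Composite midpoint rule for a real-valued function g on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def phi(n, x):
    """Orthonormal function sin(nx) / sqrt(pi) on [-pi, pi]."""
    return math.sin(n * x) / math.sqrt(math.pi)

def f(x):
    return x

norm_sq = integrate(lambda x: f(x) ** 2, -math.pi, math.pi)
c = [integrate(lambda x, n=n: f(x) * phi(n, x), -math.pi, math.pi)
     for n in range(1, 31)]
bessel_sums = [sum(v * v for v in c[:k]) for k in range(1, 31)]

print(norm_sq, bessel_sums[-1])  # the partial sums stay below the integral of |f|^2
print(c[0], c[-1])               # the coefficients shrink, in line with (74)
```

For this $f$ one can check by hand that $c_n = 2\sqrt{\pi}\,(-1)^{n+1}/n$, which agrees with the computed values.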

8.13 Trigonometric series From now on we shall deal only with the
trigonometric system. We shall consider functions $f$ that have period $2\pi$ and that are
Riemann-integrable on $[-\pi, \pi]$ (and hence on every bounded interval). The
Fourier series of $f$ is then the series (63) whose coefficients $c_n$ are given by the
integrals (62), and

(75) $$s_N(x) = s_N(f; x) = \sum_{n=-N}^{N} c_n e^{inx}$$

is the $N$th partial sum of the Fourier series of $f$. The inequality (72) now takes
the form

(76) $$\frac{1}{2\pi} \int_{-\pi}^{\pi} |s_N(x)|^2\,dx = \sum_{n=-N}^{N} |c_n|^2 \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx.$$

In order to obtain an expression for $s_N$ that is more manageable than (75)
we introduce the Dirichlet kernel

(77) $$D_N(x) = \sum_{n=-N}^{N} e^{inx} = \frac{\sin\left(N + \frac{1}{2}\right)x}{\sin(x/2)}.$$

The first of these equalities is the definition of $D_N(x)$. The second follows if
both sides of the identity

$$(e^{ix} - 1) D_N(x) = e^{i(N+1)x} - e^{-iNx}$$

are multiplied by $e^{-ix/2}$.
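The two expressions for the Dirichlet kernel in (77) are easy to compare numerically. The short sketch below evaluates both at a few sample points where $\sin(x/2) \ne 0$; the function names and the sample points are illustrative choices.

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) from its definition: the sum of e^{inx} for n = -N, ..., N."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) x) / sin(x / 2), valid when sin(x / 2) != 0."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)

for N in (0, 1, 5, 20):
    for x in (0.1, 1.0, -2.5, 3.0):
        print(N, x, dirichlet_sum(N, x) - dirichlet_closed(N, x))
```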

By (62) and (75), we have

$$s_N(f; x) = \sum_{n=-N}^{N} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt\; e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{n=-N}^{N} e^{in(x-t)}\,dt,$$

so that

(78) $$s_N(f; x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_N(x - t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) D_N(t)\,dt.$$

The periodicity of all functions involved shows that it is immaterial over which
interval we integrate, as long as its length is $2\pi$. This shows that the two integrals
in (78) are equal.

We shall prove just one theorem about the pointwise convergence of 
Fourier series. 


8.14 Theorem If, for some $x$, there are constants $\delta > 0$ and $M < \infty$ such that

(79) $$|f(x + t) - f(x)| \le M|t|$$

for all $t \in (-\delta, \delta)$, then

(80) $$\lim_{N \to \infty} s_N(f; x) = f(x).$$

Proof Define

(81) $$g(t) = \frac{f(x - t) - f(x)}{\sin(t/2)}$$

for $0 < |t| \le \pi$, and put $g(0) = 0$. By the definition (77),

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(x)\,dx = 1.$$

Hence (78) shows that

$$s_N(f; x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t) \sin\left(N + \frac{1}{2}\right)t\,dt$$

$$= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ g(t) \cos\frac{t}{2} \right] \sin Nt\,dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ g(t) \sin\frac{t}{2} \right] \cos Nt\,dt.$$

By (79) and (81), $g(t) \cos(t/2)$ and $g(t) \sin(t/2)$ are bounded. The last
two integrals thus tend to 0 as $N \to \infty$, by (74). This proves (80).
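A simple function satisfying (79) at every point (with $M = 1$) is the $2\pi$-periodic extension of $f(x) = |x|$, $-\pi \le x \le \pi$. Its Fourier series can be computed by hand: $c_0 = \pi/2$, and pairing the terms $\pm n$ gives $s_N(f; x) = \pi/2 - (4/\pi) \sum \cos(kx)/k^2$, the sum running over odd $k \le N$. The sketch below evaluates these partial sums; the function name `s_N` mirrors the text, and the sample points are arbitrary.

```python
import math

def s_N(x, N):
    """N-th partial sum of the Fourier series of f(x) = |x| on [-pi, pi]."""
    return math.pi / 2 - (4 / math.pi) * sum(
        math.cos(k * x) / (k * k) for k in range(1, N + 1, 2))

for x in (0.0, 1.0, -2.5):
    print(x, [abs(s_N(x, N) - abs(x)) for N in (11, 101, 2001)])
```

Since the coefficients here are of order $1/k^2$, the convergence is in fact uniform, which is more than Theorem 8.14 asserts.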


Corollary If $f(x) = 0$ for all $x$ in some segment $J$, then $\lim s_N(f; x) = 0$ for
every $x \in J$.


Here is another formulation of this corollary:

If $f(t) = g(t)$ for all $t$ in some neighborhood of $x$, then

$$s_N(f; x) - s_N(g; x) = s_N(f - g; x) \to 0 \quad \text{as } N \to \infty.$$

This is usually called the localization theorem. It shows that the behavior
of the sequence $\{s_N(f; x)\}$, as far as convergence is concerned, depends only on
the values of $f$ in some (arbitrarily small) neighborhood of $x$. Two Fourier
series may thus have the same behavior in one interval, but may behave in 
entirely different ways in some other interval. We have here a very striking 
contrast between Fourier series and power series (Theorem 8.5). 

We conclude with two other approximation theorems. 


8.15 Theorem If $f$ is continuous (with period $2\pi$) and if $\varepsilon > 0$, then there is a
trigonometric polynomial $P$ such that

$$|P(x) - f(x)| < \varepsilon$$

for all real $x$.


Proof If we identify $x$ and $x + 2\pi$, we may regard the $2\pi$-periodic
functions on $R^1$ as functions on the unit circle $T$, by means of the mapping
$x \to e^{ix}$. The trigonometric polynomials, i.e., the functions of the form
(60), form a self-adjoint algebra $\mathscr{A}$, which separates points on $T$, and
which vanishes at no point of $T$. Since $T$ is compact, Theorem 7.33 tells
us that $\mathscr{A}$ is dense in $\mathscr{C}(T)$. This is exactly what the theorem asserts.


A more precise form of this theorem appears in Exercise 15.

8.16 Parseval’s theorem Suppose $f$ and $g$ are Riemann-integrable functions
with period $2\pi$, and

(82) $$f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}, \qquad g(x) \sim \sum_{n=-\infty}^{\infty} \gamma_n e^{inx}.$$

Then

(83) $$\lim_{N \to \infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - s_N(f; x)|^2\,dx = 0,$$

(84) $$\frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx = \sum_{n=-\infty}^{\infty} c_n \bar{\gamma}_n,$$

(85) $$\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |c_n|^2.$$
Proof Let us use the notation

(86) $$\|h\|_2 = \left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} |h(x)|^2\,dx \right\}^{1/2}.$$

Let $\varepsilon > 0$ be given. Since $f \in \mathscr{R}$ and $f(\pi) = f(-\pi)$, the construction
described in Exercise 12 of Chap. 6 yields a continuous $2\pi$-periodic
function $h$ with

(87) $$\|f - h\|_2 < \varepsilon.$$

By Theorem 8.15, there is a trigonometric polynomial $P$ such that
$|h(x) - P(x)| < \varepsilon$ for all $x$. Hence $\|h - P\|_2 < \varepsilon$. If $P$ has degree $N_0$,
Theorem 8.11 shows that

(88) $$\|h - s_N(h)\|_2 \le \|h - P\|_2 < \varepsilon$$

for all $N \ge N_0$. By (72), with $h - f$ in place of $f$,

(89) $$\|s_N(h) - s_N(f)\|_2 = \|s_N(h - f)\|_2 \le \|h - f\|_2 < \varepsilon.$$

Now the triangle inequality (Exercise 11, Chap. 6), combined with
(87), (88), and (89), shows that

(90) $$\|f - s_N(f)\|_2 < 3\varepsilon \qquad (N \ge N_0).$$

This proves (83). Next,

(91) $$\frac{1}{2\pi} \int_{-\pi}^{\pi} s_N(f) \bar{g}\,dx = \sum_{n=-N}^{N} c_n \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \overline{g(x)}\,dx = \sum_{n=-N}^{N} c_n \bar{\gamma}_n,$$

and the Schwarz inequality shows that

(92) $$\left| \int f \bar{g} - \int s_N(f) \bar{g} \right| \le \int |f - s_N(f)|\,|g| \le \left\{ \int |f - s_N|^2 \int |g|^2 \right\}^{1/2},$$

which tends to 0, as $N \to \infty$, by (83). Comparison of (91) and (92) gives
(84). Finally, (85) is the special case $g = f$ of (84).

A more general version of Theorem 8.16 appears in Chap. 11.
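For a concrete view of (83) and (85), take the $2\pi$-periodic function equal to $x$ on $(-\pi, \pi)$. An integration by parts in (62) gives $c_0 = 0$ and $c_n = i(-1)^n/n$ for $n \ne 0$, so $|c_n| = 1/|n|$, while $\frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,dx = \pi^2/3$. By the computation in the proof of Theorem 8.11, the mean square error of $s_N$ equals $\|f\|_2^2 - \sum_{|n| \le N} |c_n|^2$. The sketch below tabulates this gap; the function name is a hypothetical label.

```python
import math

def mean_square_gap(N):
    """||f||_2^2 minus the sum of |c_n|^2 over |n| <= N, for f(x) = x on (-pi, pi)."""
    return math.pi ** 2 / 3 - 2 * sum(1.0 / (n * n) for n in range(1, N + 1))

gaps = [mean_square_gap(N) for N in (10, 100, 1000, 10000)]
print(gaps)  # positive and decreasing toward 0, roughly like 2 / N
```

That the gap tends to 0 is equivalent, for this $f$, to the identity $\sum 1/n^2 = \pi^2/6$.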


THE GAMMA FUNCTION 

This function is closely related to factorials and crops up in many unexpected 
places in analysis. Its origin, history, and development are very well described 
in an interesting article by P. J. Davis (Amer. Math. Monthly , vol. 66, 1959, 
pp. 849-869). Artin’s book (cited in the Bibliography) is another good elemen- 
tary introduction. 

Our presentation will be very condensed, with only a few comments after 
each theorem. This section may thus be regarded as a large exercise, and as an 
opportunity to apply some of the material that has been presented so far. 

8.17 Definition For $0 < x < \infty$,

(93) $$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt.$$

The integral converges for these $x$. (When $x < 1$, both 0 and $\infty$ have to
be looked at.)

8.18 Theorem

(a) The functional equation

$$\Gamma(x + 1) = x\Gamma(x)$$

holds if $0 < x < \infty$.

(b) $\Gamma(n + 1) = n!$ for $n = 1, 2, 3, \ldots$.

(c) $\log \Gamma$ is convex on $(0, \infty)$.

Proof An integration by parts proves (a). Since $\Gamma(1) = 1$, (a) implies
(b), by induction. If $1 < p < \infty$ and $(1/p) + (1/q) = 1$, apply Hölder’s
inequality (Exercise 10, Chap. 6) to (93), and obtain

$$\Gamma\left(\frac{x}{p} + \frac{y}{q}\right) \le \Gamma(x)^{1/p}\, \Gamma(y)^{1/q}.$$

This is equivalent to (c).

It is a rather surprising fact, discovered by Bohr and Mollerup, that
these three properties characterize $\Gamma$ completely.





8.19 Theorem If $f$ is a positive function on $(0, \infty)$ such that

(a) $f(x + 1) = xf(x)$,

(b) $f(1) = 1$,

(c) $\log f$ is convex,

then $f(x) = \Gamma(x)$.


Proof Since $\Gamma$ satisfies (a), (b), and (c), it is enough to prove that $f(x)$ is
uniquely determined by (a), (b), (c), for all $x > 0$. By (a), it is enough to
do this for $x \in (0, 1)$.

Put $\varphi = \log f$. Then

(94) $$\varphi(x + 1) = \varphi(x) + \log x \qquad (0 < x < \infty),$$

$\varphi(1) = 0$, and $\varphi$ is convex. Suppose $0 < x < 1$, and $n$ is a positive integer.
By (94), $\varphi(n + 1) = \log(n!)$. Consider the difference quotients of $\varphi$ on the
intervals $[n, n + 1]$, $[n + 1, n + 1 + x]$, $[n + 1, n + 2]$. Since $\varphi$ is convex,

$$\log n \le \frac{\varphi(n + 1 + x) - \varphi(n + 1)}{x} \le \log(n + 1).$$

Repeated application of (94) gives

$$\varphi(n + 1 + x) = \varphi(x) + \log[x(x + 1) \cdots (x + n)].$$

Thus

$$0 \le \varphi(x) - \log\left[ \frac{n!\,n^x}{x(x + 1) \cdots (x + n)} \right] \le x \log\left(1 + \frac{1}{n}\right).$$

The last expression tends to 0 as $n \to \infty$. Hence $\varphi(x)$ is determined, and
the proof is complete.


As a by-product we obtain the relation

(95) $$\Gamma(x) = \lim_{n \to \infty} \frac{n!\,n^x}{x(x + 1) \cdots (x + n)}$$

at least when $0 < x < 1$; from this one can deduce that (95) holds for all $x > 0$,
since $\Gamma(x + 1) = x\Gamma(x)$.
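The limit relation (95) converges slowly but is easy to observe. The sketch below evaluates the expression under the limit in logarithmic form, to avoid overflow in $n!$, and compares it with Python's library value `math.gamma`. The helper name `gamma_limit` is hypothetical, and the use of `math.lgamma` for $\log n!$ is a convenience chosen here, not part of the text's argument.

```python
import math

def gamma_limit(x, n):
    """n! * n**x / (x (x+1) ... (x+n)), the expression under the limit in (95)."""
    log_value = (math.lgamma(n + 1)                                # log n!
                 + x * math.log(n)
                 - math.fsum(math.log(x + k) for k in range(n + 1)))
    return math.exp(log_value)

for n in (10, 1000, 100_000):
    print(n, gamma_limit(0.5, n), math.gamma(0.5))  # Gamma(1/2) = sqrt(pi)
```

For $0 < x < 1$ the proof above shows that the approximations lie below $\Gamma(x)$ and that the logarithmic error is at most $x \log(1 + 1/n)$.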


8.20 Theorem If $x > 0$ and $y > 0$, then

(96) $$\int_0^1 t^{x-1} (1 - t)^{y-1}\,dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}.$$

This integral is the so-called beta function $B(x, y)$.

Proof Note that $B(1, y) = 1/y$, that $\log B(x, y)$ is a convex function of
$x$, for each fixed $y$, by Hölder’s inequality, as in Theorem 8.18, and that

(97) $$B(x + 1, y) = \frac{x}{x + y}\, B(x, y).$$

To prove (97), perform an integration by parts on

$$B(x + 1, y) = \int_0^1 \left( \frac{t}{1 - t} \right)^x (1 - t)^{x + y - 1}\,dt.$$

These three properties of $B(x, y)$ show, for each $y$, that Theorem 8.19
applies to the function $f$ defined by

$$f(x) = \frac{\Gamma(x + y)}{\Gamma(y)}\, B(x, y).$$

Hence $f(x) = \Gamma(x)$.



8.22 Stirling’s formula This provides a simple approximate expression for
$\Gamma(x + 1)$ when $x$ is large (hence for $n!$ when $n$ is large). The formula is

(103) $$\lim_{x \to \infty} \frac{\Gamma(x + 1)}{(x/e)^x \sqrt{2\pi x}} = 1.$$

Here is a proof. Put $t = x(1 + u)$ in (93). This gives

(104) $$\Gamma(x + 1) = x^{x+1} e^{-x} \int_{-1}^{\infty} [(1 + u) e^{-u}]^x\,du.$$

Determine $h(u)$ so that $h(0) = 1$ and

(105) $$(1 + u) e^{-u} = \exp\left[ -\frac{u^2}{2} h(u) \right]$$

if $-1 < u < \infty$, $u \ne 0$. Then

(106) $$h(u) = \frac{2}{u^2} [u - \log(1 + u)].$$

It follows that $h$ is continuous, and that $h(u)$ decreases monotonically from $\infty$
to 0 as $u$ increases from $-1$ to $\infty$.

The substitution $u = s\sqrt{2/x}$ turns (104) into

(107) $$\Gamma(x + 1) = x^x e^{-x} \sqrt{2x} \int_{-\infty}^{\infty} \psi_x(s)\,ds$$

where

$$\psi_x(s) = \begin{cases} \exp[-s^2 h(s\sqrt{2/x})] & (-\sqrt{x/2} < s < \infty), \\ 0 & (s \le -\sqrt{x/2}). \end{cases}$$

Note the following facts about $\psi_x(s)$:

(a) For every $s$, $\psi_x(s) \to e^{-s^2}$ as $x \to \infty$.

(b) The convergence in (a) is uniform on $[-A, A]$, for every $A < \infty$.

(c) When $s < 0$, then $0 < \psi_x(s) < e^{-s^2}$.

(d) When $s > 0$ and $x > 1$, then $0 < \psi_x(s) < \psi_1(s)$.

(e) $\int_0^{\infty} \psi_1(s)\,ds < \infty$.

The convergence theorem stated in Exercise 12 of Chap. 7 can therefore
be applied to the integral (107), and shows that this integral converges to $\sqrt{\pi}$
as $x \to \infty$, by (101). This proves (103).
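The quality of the approximation in (103) can be gauged numerically. The sketch below computes the ratio $\Gamma(x+1) / \bigl((x/e)^x \sqrt{2\pi x}\bigr)$ in logarithmic form, using the standard-library function `math.lgamma` for $\log \Gamma$; the function name `stirling_ratio` is a hypothetical label.

```python
import math

def stirling_ratio(x):
    """Gamma(x + 1) / ((x / e)**x * sqrt(2 pi x)), computed via logarithms."""
    log_ratio = math.lgamma(x + 1) - (x * (math.log(x) - 1)
                                      + 0.5 * math.log(2 * math.pi * x))
    return math.exp(log_ratio)

for x in (1, 10, 100, 1000):
    print(x, stirling_ratio(x))  # tends to 1 from above as x grows
```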

A more detailed version of this proof may be found in R. C. Buck’s 
“Advanced Calculus,” pp. 216-218. For two other, entirely different, proofs, 
see W. Feller’s article in Amer. Math. Monthly , vol. 74, 1967, pp. 1223-1225 
(with a correction in vol. 75, 1968, p. 518) and pp. 20-24 of Artin’s book. 

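Exercise 20 gives a simpler proof of a less precise result.

EXERCISES

1. Define

$$f(x) = \begin{cases} e^{-1/x^2} & (x \ne 0), \\ 0 & (x = 0). \end{cases}$$

Prove that $f$ has derivatives of all orders at $x = 0$, and that $f^{(n)}(0) = 0$ for
$n = 1, 2, 3, \ldots$.

2. Let $a_{ij}$ be the number in the $i$th row and $j$th column of the array

$$\begin{array}{rrrrc} -1 & 0 & 0 & 0 & \cdots \\ \frac{1}{2} & -1 & 0 & 0 & \cdots \\ \frac{1}{4} & \frac{1}{2} & -1 & 0 & \cdots \\ \frac{1}{8} & \frac{1}{4} & \frac{1}{2} & -1 & \cdots \end{array}$$

so that

$$a_{ij} = \begin{cases} 0 & (i < j), \\ -1 & (i = j), \\ 2^{j-i} & (i > j). \end{cases}$$

Prove that

$$\sum_i \sum_j a_{ij} = -2, \qquad \sum_j \sum_i a_{ij} = 0.$$

3. Prove that

$$\sum_i \sum_j a_{ij} = \sum_j \sum_i a_{ij}$$

if $a_{ij} \ge 0$ for all $i$ and $j$ (the case $+\infty = +\infty$ may occur).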

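4. Prove the following limit relations:

(a) $\displaystyle \lim_{x \to 0} \frac{b^x - 1}{x} = \log b \qquad (b > 0)$.

(b) $\displaystyle \lim_{x \to 0} \frac{\log(1 + x)}{x} = 1$.

(c) $\displaystyle \lim_{x \to 0} (1 + x)^{1/x} = e$.

(d) $\displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$.

5. Find the following limits:

(a) $\displaystyle \lim_{x \to 0} \frac{e - (1 + x)^{1/x}}{x}$.

(b) $\displaystyle \lim_{n \to \infty} \frac{n}{\log n} \left[ n^{1/n} - 1 \right]$.

(c) $\displaystyle \lim_{x \to 0} \frac{\tan x - x}{x(1 - \cos x)}$.

(d) $\displaystyle \lim_{x \to 0} \frac{x - \sin x}{\tan x - x}$.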

6. Suppose $f(x)f(y) = f(x + y)$ for all real $x$ and $y$.

(a) Assuming that $f$ is differentiable and not zero, prove that

$$f(x) = e^{cx}$$

where $c$ is a constant.

(b) Prove the same thing, assuming only that $f$ is continuous.

7. If $0 < x < \dfrac{\pi}{2}$, prove that

$$\frac{2}{\pi} < \frac{\sin x}{x} < 1.$$

8. For $n = 0, 1, 2, \ldots$, and $x$ real, prove that

$$|\sin nx| \le n\,|\sin x|.$$

Note that this inequality may be false for other values of $n$. For instance,

$$|\sin \tfrac{1}{2}\pi| > \tfrac{1}{2}\,|\sin \pi|.$$

9. (a) Put $s_N = 1 + (\frac{1}{2}) + \cdots + (1/N)$. Prove that

$$\lim_{N \to \infty} (s_N - \log N)$$

exists. (The limit, often denoted by $\gamma$, is called Euler’s constant. Its numerical
value is $0.5772\ldots$. It is not known whether $\gamma$ is rational or not.)

(b) Roughly how large must $m$ be so that $N = 10^m$ satisfies $s_N > 100$?

10. Prove that $\sum 1/p$ diverges; the sum extends over all primes.

(This shows that the primes form a fairly substantial subset of the positive
integers.)

Hint: Given $N$, let $p_1, \dots, p_k$ be those primes that divide at least one
integer $\le N$. Then

$$\sum_{n=1}^{N} \frac{1}{n} \le \prod_{j=1}^{k} \left(1 + \frac{1}{p_j} + \frac{1}{p_j^2} + \cdots\right)
= \prod_{j=1}^{k} \left(1 - \frac{1}{p_j}\right)^{-1} \le \exp \sum_{j=1}^{k} \frac{2}{p_j}.$$

The last inequality holds because

$$(1 - x)^{-1} \le e^{2x}$$

if $0 \le x \le \tfrac12$.

(There are many proofs of this result. See, for instance, the article by
I. Niven in Amer. Math. Monthly, vol. 78, 1971, pp. 272-273, and the one by
R. Bellman in Amer. Math. Monthly, vol. 50, 1943, pp. 318-319.)

11. Suppose $f \in \mathscr{R}$ on $[0, A]$ for all $A < \infty$, and $f(x) \to 1$ as $x \to +\infty$. Prove that

$$\lim_{t \to 0} t \int_0^\infty e^{-tx} f(x)\,dx = 1 \qquad (t > 0).$$


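12. Suppose $0 < \delta < \pi$, $f(x) = 1$ if $|x| \le \delta$, $f(x) = 0$ if $\delta < |x| \le \pi$, and $f(x + 2\pi) =
f(x)$ for all $x$.

(a) Compute the Fourier coefficients of $f$.

(b) Conclude that

$$\sum_{n=1}^{\infty} \frac{\sin (n\delta)}{n} = \frac{\pi - \delta}{2} \qquad (0 < \delta < \pi).$$

(c) Deduce from Parseval's theorem that

$$\sum_{n=1}^{\infty} \frac{\sin^2 (n\delta)}{n^2 \delta} = \frac{\pi - \delta}{2}.$$

(d) Let $\delta \to 0$ and prove that

$$\int_0^\infty \left(\frac{\sin x}{x}\right)^2 dx = \frac{\pi}{2}.$$

(e) Put $\delta = \pi/2$ in (c). What do you get?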

13. Put $f(x) = x$ if $0 \le x < 2\pi$, and apply Parseval's theorem to conclude that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$

14. If $f(x) = (\pi - |x|)^2$ on $[-\pi, \pi]$, prove that

$$f(x) = \frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4}{n^2} \cos nx$$

and deduce that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}, \qquad \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}.$$

(A recent article by E. L. Stark contains many references to series of the form
$\sum n^{-s}$, where $s$ is a positive integer. See Math. Mag., vol. 47, 1974, pp. 197-202.)
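
The two sums in Exercise 14 are easy to check numerically. The sketch below is only a sanity check under the assumption that floating-point partial sums are accurate enough; the helper name `partial_sum` is introduced here for illustration.

```python
import math

def partial_sum(s, N):
    """Return the N-th partial sum of the series sum of n^(-s)."""
    return math.fsum(1.0 / n ** s for n in range(1, N + 1))

# The tail of sum 1/n^2 beyond N is roughly 1/N, so convergence is slow;
# the tail of sum 1/n^4 is roughly 1/(3 N^3), so convergence is fast.
zeta2 = partial_sum(2, 100_000)
zeta4 = partial_sum(4, 1_000)
print(zeta2, math.pi ** 2 / 6)
print(zeta4, math.pi ** 4 / 90)
```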

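15. With $D_n$ as defined in (77), put

$$K_N(x) = \frac{1}{N + 1} \sum_{n=0}^{N} D_n(x).$$

Prove that

$$K_N(x) = \frac{1}{N + 1} \cdot \frac{1 - \cos (N + 1)x}{1 - \cos x}$$

and that

(a) $K_N \ge 0$,

(b) $\dfrac{1}{2\pi} \displaystyle\int_{-\pi}^{\pi} K_N(x)\,dx = 1$,

(c) $K_N(x) \le \dfrac{1}{N + 1} \cdot \dfrac{2}{1 - \cos \delta}$ if $0 < \delta \le |x| \le \pi$.

If $s_N = s_N(f; x)$ is the $N$th partial sum of the Fourier series of $f$, consider
the arithmetic means

$$\sigma_N = \frac{s_0 + s_1 + \cdots + s_N}{N + 1}.$$

Prove that

$$\sigma_N(f; x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) K_N(t)\,dt,$$

and hence prove Fejér's theorem:

If $f$ is continuous, with period $2\pi$, then $\sigma_N(f; x) \to f(x)$ uniformly on $[-\pi, \pi]$.
Use properties (a), (b), (c) to proceed as in Theorem 7.26.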

16. Prove a pointwise version of Fejér's theorem:

If $f \in \mathscr{R}$ and $f(x+)$, $f(x-)$ exist for some $x$, then

$$\lim_{N \to \infty} \sigma_N(f; x) = \tfrac12 [f(x+) + f(x-)].$$

17. Assume $f$ is bounded and monotonic on $[-\pi, \pi)$, with Fourier coefficients $c_n$, as
given by (62).

(a) Use Exercise 17 of Chap. 6 to prove that $\{nc_n\}$ is a bounded sequence.

(b) Combine (a) with Exercise 16 and with Exercise 14(e) of Chap. 3, to conclude
that

$$\lim_{N \to \infty} s_N(f; x) = \tfrac12 [f(x+) + f(x-)]$$

for every $x$.

(c) Assume only that $f \in \mathscr{R}$ on $[-\pi, \pi]$ and that $f$ is monotonic in some segment
$(\alpha, \beta) \subset [-\pi, \pi]$. Prove that the conclusion of (b) holds for every $x \in (\alpha, \beta)$.

(This is an application of the localization theorem.)

18. Define

$$f(x) = x^3 - \sin^2 x \tan x,$$
$$g(x) = 2x^2 - \sin^2 x - x \tan x.$$

Find out, for each of these two functions, whether it is positive or negative for all
$x \in (0, \pi/2)$, or whether it changes sign. Prove your answer.

19. Suppose $f$ is a continuous function on $R^1$, $f(x + 2\pi) = f(x)$, and $\alpha/\pi$ is irrational.
Prove that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x + n\alpha) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,dt$$

for every $x$. Hint: Do it first for $f(x) = e^{ikx}$.

20. The following simple computation yields a good approximation to Stirling's
formula.

For $m = 1, 2, 3, \dots$, define

$$f(x) = (m + 1 - x) \log m + (x - m) \log (m + 1)$$

if $m \le x \le m + 1$, and define

$$g(x) = \frac{x}{m} - 1 + \log m$$

if $m - \tfrac12 \le x < m + \tfrac12$. Draw the graphs of $f$ and $g$. Note that $f(x) \le \log x \le g(x)$
if $x \ge 1$ and that

$$\int_1^n f(x)\,dx = \log (n!) - \tfrac12 \log n > -\tfrac18 + \int_1^n g(x)\,dx.$$

Integrate $\log x$ over $[1, n]$. Conclude that

$$\tfrac78 < \log (n!) - (n + \tfrac12) \log n + n < 1$$

for $n = 2, 3, 4, \dots$. (Note: $\log \sqrt{2\pi} \sim 0.918\ldots$.) Thus

$$e^{7/8} < \frac{n!}{(n/e)^n \sqrt{n}} < e.$$
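
The bounds in Exercise 20 can be checked for a range of $n$. This is a hedged numerical sketch only: it relies on `math.lgamma(n + 1)` as a stand-in for $\log (n!)$, and the name `stirling_gap` is an assumption made for this illustration.

```python
import math

def stirling_gap(n):
    """Return log(n!) - (n + 1/2) log n + n, using lgamma(n + 1) = log(n!)."""
    return math.lgamma(n + 1) - (n + 0.5) * math.log(n) + n

# The exercise asserts 7/8 < gap < 1 for n = 2, 3, 4, ...
gaps = [stirling_gap(n) for n in range(2, 2000)]
print(min(gaps), max(gaps))

# For large n the gap should be close to log sqrt(2 pi).
print(stirling_gap(10 ** 6), math.log(math.sqrt(2 * math.pi)))
```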

21. Let

$$L_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(t)|\,dt \qquad (n = 1, 2, 3, \dots).$$

Prove that there exists a constant $C > 0$ such that

$$L_n > C \log n \qquad (n = 1, 2, 3, \dots),$$

or, more precisely, that the sequence

$$\left\{ L_n - \frac{4}{\pi^2} \log n \right\}$$

is bounded.

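22. If $\alpha$ is real and $-1 < x < 1$, prove Newton's binomial theorem

$$(1 + x)^\alpha = 1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!}\, x^n.$$

Hint: Denote the right side by $f(x)$. Prove that the series converges. Prove that

$$(1 + x) f'(x) = \alpha f(x)$$

and solve this differential equation.

Show also that

$$(1 - x)^{-\alpha} = \sum_{n=0}^{\infty} \frac{\Gamma(n + \alpha)}{n!\,\Gamma(\alpha)}\, x^n$$

if $-1 < x < 1$ and $\alpha > 0$.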

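23. Let $\gamma$ be a continuously differentiable closed curve in the complex plane, with
parameter interval $[a, b]$, and assume that $\gamma(t) \ne 0$ for every $t \in [a, b]$. Define the
index of $\gamma$ to be

$$\operatorname{Ind} (\gamma) = \frac{1}{2\pi i} \int_a^b \frac{\gamma'(t)}{\gamma(t)}\,dt.$$

Prove that $\operatorname{Ind} (\gamma)$ is always an integer.

Hint: There exists $\varphi$ on $[a, b]$ with $\varphi' = \gamma'/\gamma$, $\varphi(a) = 0$. Hence $\gamma \exp(-\varphi)$
is constant. Since $\gamma(a) = \gamma(b)$ it follows that $\exp \varphi(b) = \exp \varphi(a) = 1$. Note that
$\varphi(b) = 2\pi i \operatorname{Ind} (\gamma)$.

Compute $\operatorname{Ind} (\gamma)$ when $\gamma(t) = e^{int}$, $a = 0$, $b = 2\pi$.

Explain why $\operatorname{Ind} (\gamma)$ is often called the winding number of $\gamma$ around 0.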
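
The index of Exercise 23 can be approximated by replacing the integral with a midpoint Riemann sum. The sketch below is an illustration under that assumption; the function `index` and the sample curves are hypothetical choices made here, and the curve and its derivative are passed in as separate callables.

```python
import cmath
import math

def index(gamma, dgamma, a, b, steps=20000):
    """Approximate (1 / (2 pi i)) * integral over [a, b] of gamma'(t) / gamma(t) dt."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        t = a + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += dgamma(t) / gamma(t) * h
    return total / (2j * math.pi)

# A curve that circles the origin three times: gamma(t) = e^{3it}.
wind3 = index(lambda t: cmath.exp(3j * t), lambda t: 3j * cmath.exp(3j * t), 0.0, 2 * math.pi)

# A circle of radius 1 about the point 2; its range avoids the negative real axis.
shifted = index(lambda t: 2 + cmath.exp(1j * t), lambda t: 1j * cmath.exp(1j * t), 0.0, 2 * math.pi)

print(wind3, shifted)
```

The second curve satisfies the hypothesis of Exercise 24, and the computed value is consistent with the conclusion stated there.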

24. Let $\gamma$ be as in Exercise 23, and assume in addition that the range of $\gamma$ does not
intersect the negative real axis. Prove that $\operatorname{Ind} (\gamma) = 0$. Hint: For $0 \le c < \infty$,
$\operatorname{Ind} (\gamma + c)$ is a continuous integer-valued function of $c$. Also, $\operatorname{Ind} (\gamma + c) \to 0$
as $c \to \infty$.

25. Suppose $\gamma_1$ and $\gamma_2$ are curves as in Exercise 23, and

$$|\gamma_1(t) - \gamma_2(t)| < |\gamma_1(t)| \qquad (a \le t \le b).$$

Prove that $\operatorname{Ind} (\gamma_1) = \operatorname{Ind} (\gamma_2)$.

Hint: Put $\gamma = \gamma_2/\gamma_1$. Then $|1 - \gamma| < 1$, hence $\operatorname{Ind} (\gamma) = 0$, by Exercise 24.
Also,

$$\frac{\gamma'}{\gamma} = \frac{\gamma_2'}{\gamma_2} - \frac{\gamma_1'}{\gamma_1}.$$

26. Let $\gamma$ be a closed curve in the complex plane (not necessarily differentiable) with
parameter interval $[0, 2\pi]$, such that $\gamma(t) \ne 0$ for every $t \in [0, 2\pi]$.

Choose $\delta > 0$ so that $|\gamma(t)| > \delta$ for all $t \in [0, 2\pi]$. If $P_1$ and $P_2$ are
trigonometric polynomials such that $|P_j(t) - \gamma(t)| < \delta/4$ for all $t \in [0, 2\pi]$ (their
existence is assured by Theorem 8.15), prove that

$$\operatorname{Ind} (P_1) = \operatorname{Ind} (P_2)$$

by applying Exercise 25.

Define this common value to be $\operatorname{Ind} (\gamma)$.

Prove that the statements of Exercises 24 and 25 hold without any
differentiability assumption.

27. Let $f$ be a continuous complex function defined in the complex plane. Suppose
there is a positive integer $n$ and a complex number $c \ne 0$ such that

$$\lim_{|z| \to \infty} z^{-n} f(z) = c.$$

Prove that $f(z) = 0$ for at least one complex number $z$.

Note that this is a generalization of Theorem 8.8.

Hint: Assume $f(z) \ne 0$ for all $z$, define

$$\gamma_r(t) = f(re^{it})$$

for $0 \le r < \infty$, $0 \le t \le 2\pi$, and prove the following statements about the curves
$\gamma_r$:

(a) $\operatorname{Ind} (\gamma_0) = 0$.

(b) $\operatorname{Ind} (\gamma_r) = n$ for all sufficiently large $r$.

(c) $\operatorname{Ind} (\gamma_r)$ is a continuous function of $r$, on $[0, \infty)$.

[In (b) and (c), use the last part of Exercise 26.]

Show that (a), (b), and (c) are contradictory, since $n > 0$.

28. Let $D$ be the closed unit disc in the complex plane. (Thus $z \in D$ if and only if
$|z| \le 1$.) Let $g$ be a continuous mapping of $D$ into the unit circle $T$. (Thus,
$|g(z)| = 1$ for every $z \in D$.)

Prove that $g(z) = -z$ for at least one $z \in T$.

Hint: For $0 \le r \le 1$, $0 \le t \le 2\pi$, put

$$\gamma_r(t) = g(re^{it}),$$

and put $\psi(t) = e^{-it} \gamma_1(t)$. If $g(z) \ne -z$ for every $z \in T$, then $\psi(t) \ne -1$ for every
$t \in [0, 2\pi]$. Hence $\operatorname{Ind} (\psi) = 0$, by Exercises 24 and 26. It follows that $\operatorname{Ind} (\gamma_1) = 1$.
But $\operatorname{Ind} (\gamma_0) = 0$. Derive a contradiction, as in Exercise 27.

29. Prove that every continuous mapping $f$ of $D$ into $D$ has a fixed point in $D$.

(This is the 2-dimensional case of Brouwer's fixed-point theorem.)

Hint: Assume $f(z) \ne z$ for every $z \in D$. Associate to each $z \in D$ the point
$g(z) \in T$ which lies on the ray that starts at $f(z)$ and passes through $z$. Then $g$
maps $D$ into $T$, $g(z) = z$ if $z \in T$, and $g$ is continuous, because

$$g(z) = z - s(z)[f(z) - z],$$

where $s(z)$ is the unique nonnegative root of a certain quadratic equation whose
coefficients are continuous functions of $f$ and $z$. Apply Exercise 28.

FUNCTIONS OF SEVERAL VARIABLES


LINEAR TRANSFORMATIONS 

We begin this chapter with a discussion of sets of vectors in euclidean $n$-space $R^n$.
The algebraic facts presented here extend without change to finite-dimensional 
vector spaces over any field of scalars. However, for our purposes it is quite 
sufficient to stay within the familiar framework provided by the euclidean spaces. 

9.1 Definitions 

(a) A nonempty set $X \subset R^n$ is a vector space if $\mathbf{x} + \mathbf{y} \in X$ and $c\mathbf{x} \in X$
for all $\mathbf{x} \in X$, $\mathbf{y} \in X$, and for all scalars $c$.

(b) If $\mathbf{x}_1, \dots, \mathbf{x}_k \in R^n$ and $c_1, \dots, c_k$ are scalars, the vector

$$c_1 \mathbf{x}_1 + \cdots + c_k \mathbf{x}_k$$

is called a linear combination of $\mathbf{x}_1, \dots, \mathbf{x}_k$. If $S \subset R^n$ and if $E$ is the set
of all linear combinations of elements of $S$, we say that $S$ spans $E$, or that
$E$ is the span of $S$.

Observe that every span is a vector space.

(c) A set consisting of vectors $\mathbf{x}_1, \dots, \mathbf{x}_k$ (we shall use the notation
$\{\mathbf{x}_1, \dots, \mathbf{x}_k\}$ for such a set) is said to be independent if the relation
$c_1 \mathbf{x}_1 + \cdots + c_k \mathbf{x}_k = \mathbf{0}$ implies that $c_1 = \cdots = c_k = 0$. Otherwise $\{\mathbf{x}_1, \dots, \mathbf{x}_k\}$
is said to be dependent.

Observe that no independent set contains the null vector.

(d) If a vector space $X$ contains an independent set of $r$ vectors but contains
no independent set of $r + 1$ vectors, we say that $X$ has dimension $r$,
and write: $\dim X = r$.

The set consisting of $\mathbf{0}$ alone is a vector space; its dimension is 0.

(e) An independent subset of a vector space $X$ which spans $X$ is called
a basis of $X$.

Observe that if $B = \{\mathbf{x}_1, \dots, \mathbf{x}_r\}$ is a basis of $X$, then every $\mathbf{x} \in X$
has a unique representation of the form $\mathbf{x} = \sum c_j \mathbf{x}_j$. Such a representation
exists since $B$ spans $X$, and it is unique since $B$ is independent. The
numbers $c_1, \dots, c_r$ are called the coordinates of $\mathbf{x}$ with respect to the
basis $B$.

The most familiar example of a basis is the set $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, where
$\mathbf{e}_j$ is the vector in $R^n$ whose $j$th coordinate is 1 and whose other coordinates
are all 0. If $\mathbf{x} \in R^n$, $\mathbf{x} = (x_1, \dots, x_n)$, then $\mathbf{x} = \sum x_j \mathbf{e}_j$. We shall call

$$\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$$

the standard basis of $R^n$.

9.2 Theorem Let $r$ be a positive integer. If a vector space $X$ is spanned by a
set of $r$ vectors, then $\dim X \le r$.

Proof If this is false, there is a vector space $X$ which contains an
independent set $Q = \{\mathbf{y}_1, \dots, \mathbf{y}_{r+1}\}$ and which is spanned by a set $S_0$ consisting
of $r$ vectors.

Suppose $0 \le i < r$, and suppose a set $S_i$ has been constructed which
spans $X$ and which consists of all $\mathbf{y}_j$ with $1 \le j \le i$ plus a certain collection
of $r - i$ members of $S_0$, say $\mathbf{x}_1, \dots, \mathbf{x}_{r-i}$. (In other words, $S_i$ is obtained
from $S_0$ by replacing $i$ of its elements by members of $Q$, without altering
the span.) Since $S_i$ spans $X$, $\mathbf{y}_{i+1}$ is in the span of $S_i$; hence there are
scalars $a_1, \dots, a_{i+1}, b_1, \dots, b_{r-i}$, with $a_{i+1} = 1$, such that

$$\sum_{j=1}^{i+1} a_j \mathbf{y}_j + \sum_{k=1}^{r-i} b_k \mathbf{x}_k = \mathbf{0}.$$

If all $b_k$'s were 0, the independence of $Q$ would force all $a_j$'s to be 0, a
contradiction. It follows that some $\mathbf{x}_k \in S_i$ is a linear combination of the
other members of $T_i = S_i \cup \{\mathbf{y}_{i+1}\}$. Remove this $\mathbf{x}_k$ from $T_i$ and call the
remaining set $S_{i+1}$. Then $S_{i+1}$ spans the same set as $T_i$, namely $X$, so
that $S_{i+1}$ has the properties postulated for $S_i$ with $i + 1$ in place of $i$.

Starting with $S_0$, we thus construct sets $S_1, \dots, S_r$. The last of
these consists of $\mathbf{y}_1, \dots, \mathbf{y}_r$, and our construction shows that it spans $X$.
But $Q$ is independent; hence $\mathbf{y}_{r+1}$ is not in the span of $S_r$. This
contradiction establishes the theorem.

Corollary $\dim R^n = n$.

Proof Since $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ spans $R^n$, the theorem shows that $\dim R^n \le n$.
Since $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is independent, $\dim R^n \ge n$.

9.3 Theorem Suppose $X$ is a vector space, and $\dim X = n$.

(a) A set $E$ of $n$ vectors in $X$ spans $X$ if and only if $E$ is independent.

(b) $X$ has a basis, and every basis consists of $n$ vectors.

(c) If $1 \le r \le n$ and $\{\mathbf{y}_1, \dots, \mathbf{y}_r\}$ is an independent set in $X$, then $X$ has a
basis containing $\{\mathbf{y}_1, \dots, \mathbf{y}_r\}$.

Proof Suppose $E = \{\mathbf{x}_1, \dots, \mathbf{x}_n\}$. Since $\dim X = n$, the set $\{\mathbf{x}_1, \dots, \mathbf{x}_n, \mathbf{y}\}$
is dependent, for every $\mathbf{y} \in X$. If $E$ is independent, it follows that $\mathbf{y}$ is in
the span of $E$; hence $E$ spans $X$. Conversely, if $E$ is dependent, one of its
members can be removed without changing the span of $E$. Hence $E$
cannot span $X$, by Theorem 9.2. This proves (a).

Since $\dim X = n$, $X$ contains an independent set of $n$ vectors, and
(a) shows that every such set is a basis of $X$; (b) now follows from 9.1(d)
and 9.2.

To prove (c), let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be a basis of $X$. The set

$$S = \{\mathbf{y}_1, \dots, \mathbf{y}_r, \mathbf{x}_1, \dots, \mathbf{x}_n\}$$

spans $X$ and is dependent, since it contains more than $n$ vectors. The
argument used in the proof of Theorem 9.2 shows that one of the $\mathbf{x}_i$'s is
a linear combination of the other members of $S$. If we remove this $\mathbf{x}_i$ from
$S$, the remaining set still spans $X$. This process can be repeated $r$ times
and leads to a basis of $X$ which contains $\{\mathbf{y}_1, \dots, \mathbf{y}_r\}$, by (a).

9.4 Definitions A mapping $A$ of a vector space $X$ into a vector space $Y$ is said
to be a linear transformation if

$$A(\mathbf{x}_1 + \mathbf{x}_2) = A\mathbf{x}_1 + A\mathbf{x}_2, \qquad A(c\mathbf{x}) = cA\mathbf{x}$$

for all $\mathbf{x}, \mathbf{x}_1, \mathbf{x}_2 \in X$ and all scalars $c$. Note that one often writes $A\mathbf{x}$ instead
of $A(\mathbf{x})$ if $A$ is linear.

Observe that $A\mathbf{0} = \mathbf{0}$ if $A$ is linear. Observe also that a linear transformation
$A$ of $X$ into $Y$ is completely determined by its action on any basis: If
$\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ is a basis of $X$, then every $\mathbf{x} \in X$ has a unique representation of the
form

$$\mathbf{x} = \sum_{i=1}^{n} c_i \mathbf{x}_i,$$

and the linearity of $A$ allows us to compute $A\mathbf{x}$ from the vectors $A\mathbf{x}_1, \dots, A\mathbf{x}_n$
and the coordinates $c_1, \dots, c_n$ by the formula

$$A\mathbf{x} = \sum_{i=1}^{n} c_i A\mathbf{x}_i.$$

Linear transformations of $X$ into $X$ are often called linear operators on $X$.
If $A$ is a linear operator on $X$ which (i) is one-to-one and (ii) maps $X$ onto
$X$, we say that $A$ is invertible. In this case we can define an operator $A^{-1}$ on $X$
by requiring that $A^{-1}(A\mathbf{x}) = \mathbf{x}$ for all $\mathbf{x} \in X$. It is trivial to verify that we then
also have $A(A^{-1}\mathbf{x}) = \mathbf{x}$, for all $\mathbf{x} \in X$, and that $A^{-1}$ is linear.

An important fact about linear operators on finite-dimensional vector 
spaces is that each of the above conditions (i) and (ii) implies the other: 

9.5 Theorem A linear operator $A$ on a finite-dimensional vector space $X$ is
one-to-one if and only if the range of $A$ is all of $X$.

Proof Let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be a basis of $X$. The linearity of $A$ shows that
its range $\mathscr{R}(A)$ is the span of the set $Q = \{A\mathbf{x}_1, \dots, A\mathbf{x}_n\}$. We therefore
infer from Theorem 9.3(a) that $\mathscr{R}(A) = X$ if and only if $Q$ is independent.
We have to prove that this happens if and only if $A$ is one-to-one.

Suppose $A$ is one-to-one and $\sum c_i A\mathbf{x}_i = \mathbf{0}$. Then $A(\sum c_i \mathbf{x}_i) = \mathbf{0}$, hence
$\sum c_i \mathbf{x}_i = \mathbf{0}$, hence $c_1 = \cdots = c_n = 0$, and we conclude that $Q$ is independent.

Conversely, suppose $Q$ is independent and $A(\sum c_i \mathbf{x}_i) = \mathbf{0}$. Then
$\sum c_i A\mathbf{x}_i = \mathbf{0}$, hence $c_1 = \cdots = c_n = 0$, and we conclude: $A\mathbf{x} = \mathbf{0}$ only if
$\mathbf{x} = \mathbf{0}$. If now $A\mathbf{x} = A\mathbf{y}$, then $A(\mathbf{x} - \mathbf{y}) = A\mathbf{x} - A\mathbf{y} = \mathbf{0}$, so that $\mathbf{x} - \mathbf{y} = \mathbf{0}$,
and this says that $A$ is one-to-one.

9.6 Definitions 

(a) Let $L(X, Y)$ be the set of all linear transformations of the vector space
$X$ into the vector space $Y$. Instead of $L(X, X)$, we shall simply write $L(X)$.
If $A_1, A_2 \in L(X, Y)$ and if $c_1, c_2$ are scalars, define $c_1 A_1 + c_2 A_2$ by

$$(c_1 A_1 + c_2 A_2)\mathbf{x} = c_1 A_1 \mathbf{x} + c_2 A_2 \mathbf{x} \qquad (\mathbf{x} \in X).$$

It is then clear that $c_1 A_1 + c_2 A_2 \in L(X, Y)$.

(b) If $X$, $Y$, $Z$ are vector spaces, and if $A \in L(X, Y)$ and $B \in L(Y, Z)$, we
define their product $BA$ to be the composition of $A$ and $B$:

$$(BA)\mathbf{x} = B(A\mathbf{x}) \qquad (\mathbf{x} \in X).$$

Then $BA \in L(X, Z)$.

Note that $BA$ need not be the same as $AB$, even if $X = Y = Z$.

(c) For $A \in L(R^n, R^m)$, define the norm $\|A\|$ of $A$ to be the sup of all
numbers $|A\mathbf{x}|$, where $\mathbf{x}$ ranges over all vectors in $R^n$ with $|\mathbf{x}| \le 1$.
Observe that the inequality

$$|A\mathbf{x}| \le \|A\|\,|\mathbf{x}|$$

holds for all $\mathbf{x} \in R^n$. Also, if $\lambda$ is such that $|A\mathbf{x}| \le \lambda |\mathbf{x}|$ for all $\mathbf{x} \in R^n$,
then $\|A\| \le \lambda$.

9.7 Theorem 

(a) If $A \in L(R^n, R^m)$, then $\|A\| < \infty$ and $A$ is a uniformly continuous
mapping of $R^n$ into $R^m$.

(b) If $A, B \in L(R^n, R^m)$ and $c$ is a scalar, then

$$\|A + B\| \le \|A\| + \|B\|, \qquad \|cA\| = |c|\,\|A\|.$$

With the distance between $A$ and $B$ defined as $\|A - B\|$, $L(R^n, R^m)$ is a
metric space.

(c) If $A \in L(R^n, R^m)$ and $B \in L(R^m, R^k)$, then

$$\|BA\| \le \|B\|\,\|A\|.$$

Proof

(a) Let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be the standard basis in $R^n$ and suppose $\mathbf{x} = \sum c_i \mathbf{e}_i$,
$|\mathbf{x}| \le 1$, so that $|c_i| \le 1$ for $i = 1, \dots, n$. Then

$$|A\mathbf{x}| = \left| \sum c_i A\mathbf{e}_i \right| \le \sum |c_i|\,|A\mathbf{e}_i| \le \sum |A\mathbf{e}_i|$$

so that

$$\|A\| \le \sum_{i=1}^{n} |A\mathbf{e}_i| < \infty.$$

Since $|A\mathbf{x} - A\mathbf{y}| \le \|A\|\,|\mathbf{x} - \mathbf{y}|$ if $\mathbf{x}, \mathbf{y} \in R^n$, we see that $A$ is uniformly
continuous.

(b) The inequality in (b) follows from

$$|(A + B)\mathbf{x}| = |A\mathbf{x} + B\mathbf{x}| \le |A\mathbf{x}| + |B\mathbf{x}| \le (\|A\| + \|B\|)\,|\mathbf{x}|.$$

The second part of (b) is proved in the same manner. If

$$A, B, C \in L(R^n, R^m),$$

we have the triangle inequality

$$\|A - C\| = \|(A - B) + (B - C)\| \le \|A - B\| + \|B - C\|,$$

and it is easily verified that $\|A - B\|$ has the other properties of a metric
(Definition 2.15).

(c) Finally, (c) follows from

$$|(BA)\mathbf{x}| = |B(A\mathbf{x})| \le \|B\|\,|A\mathbf{x}| \le \|B\|\,\|A\|\,|\mathbf{x}|.$$

Since we now have metrics in the spaces $L(R^n, R^m)$, the concepts of open
set, continuity, etc., make sense for these spaces. Our next theorem utilizes
these concepts.

9.8 Theorem Let $\Omega$ be the set of all invertible linear operators on $R^n$.

(a) If $A \in \Omega$, $B \in L(R^n)$, and

$$\|B - A\| \cdot \|A^{-1}\| < 1,$$

then $B \in \Omega$.

(b) $\Omega$ is an open subset of $L(R^n)$, and the mapping $A \to A^{-1}$ is continuous
on $\Omega$.

(This mapping is also obviously a 1-1 mapping of $\Omega$ onto $\Omega$,
which is its own inverse.)

Proof

(a) Put $\|A^{-1}\| = 1/\alpha$, put $\|B - A\| = \beta$. Then $\beta < \alpha$. For every $\mathbf{x} \in R^n$,

$$\alpha |\mathbf{x}| = \alpha |A^{-1} A\mathbf{x}| \le \alpha \|A^{-1}\| \cdot |A\mathbf{x}|
= |A\mathbf{x}| \le |(A - B)\mathbf{x}| + |B\mathbf{x}| \le \beta |\mathbf{x}| + |B\mathbf{x}|,$$

so that

$$(\alpha - \beta)\,|\mathbf{x}| \le |B\mathbf{x}| \qquad (\mathbf{x} \in R^n). \tag{1}$$

Since $\alpha - \beta > 0$, (1) shows that $B\mathbf{x} \ne \mathbf{0}$ if $\mathbf{x} \ne \mathbf{0}$. Hence $B$ is 1-1.
By Theorem 9.5, $B \in \Omega$. This holds for all $B$ with $\|B - A\| < \alpha$. Thus
we have (a) and the fact that $\Omega$ is open.

(b) Next, replace $\mathbf{x}$ by $B^{-1}\mathbf{y}$ in (1). The resulting inequality

$$(\alpha - \beta)\,|B^{-1}\mathbf{y}| \le |BB^{-1}\mathbf{y}| = |\mathbf{y}| \qquad (\mathbf{y} \in R^n) \tag{2}$$

shows that $\|B^{-1}\| \le (\alpha - \beta)^{-1}$. The identity

$$B^{-1} - A^{-1} = B^{-1}(A - B)A^{-1},$$

combined with Theorem 9.7(c), implies therefore that

$$\|B^{-1} - A^{-1}\| \le \|B^{-1}\|\,\|A - B\|\,\|A^{-1}\| \le \frac{\beta}{\alpha(\alpha - \beta)}.$$

This establishes the continuity assertion made in (b), since $\beta \to 0$ as $B \to A$.

9.9 Matrices Suppose $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ and $\{\mathbf{y}_1, \dots, \mathbf{y}_m\}$ are bases of vector spaces
$X$ and $Y$, respectively. Then every $A \in L(X, Y)$ determines a set of numbers
$a_{ij}$ such that

$$A\mathbf{x}_j = \sum_{i=1}^{m} a_{ij} \mathbf{y}_i \qquad (1 \le j \le n). \tag{3}$$

It is convenient to visualize these numbers in a rectangular array of $m$ rows
and $n$ columns, called an $m$ by $n$ matrix:

$$[A] = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\cdots & \cdots & \cdots & \cdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}$$

Observe that the coordinates $a_{ij}$ of the vector $A\mathbf{x}_j$ (with respect to the basis
$\{\mathbf{y}_1, \dots, \mathbf{y}_m\}$) appear in the $j$th column of $[A]$. The vectors $A\mathbf{x}_j$ are therefore
sometimes called the column vectors of $[A]$. With this terminology, the range
of $A$ is spanned by the column vectors of $[A]$.

If $\mathbf{x} = \sum c_j \mathbf{x}_j$, the linearity of $A$, combined with (3), shows that

$$A\mathbf{x} = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} c_j \right) \mathbf{y}_i. \tag{4}$$

Thus the coordinates of $A\mathbf{x}$ are $\sum_j a_{ij} c_j$. Note that in (3) the summation
ranges over the first subscript of $a_{ij}$, but that we sum over the second subscript
when computing coordinates.

Suppose next that an $m$ by $n$ matrix is given, with real entries $a_{ij}$. If $A$ is
then defined by (4), it is clear that $A \in L(X, Y)$ and that $[A]$ is the given matrix.
Thus there is a natural 1-1 correspondence between $L(X, Y)$ and the set of all
real $m$ by $n$ matrices. We emphasize, though, that $[A]$ depends not only on $A$
but also on the choice of bases in $X$ and $Y$. The same $A$ may give rise to many
different matrices if we change bases, and vice versa. We shall not pursue this
observation any further, since we shall usually work with fixed bases. (Some
remarks on this may be found in Sec. 9.37.)

If $Z$ is a third vector space, with basis $\{\mathbf{z}_1, \dots, \mathbf{z}_p\}$, if $A$ is given by (3),
and if

$$B\mathbf{y}_i = \sum_k b_{ki} \mathbf{z}_k, \qquad (BA)\mathbf{x}_j = \sum_k c_{kj} \mathbf{z}_k,$$

then $A \in L(X, Y)$, $B \in L(Y, Z)$, $BA \in L(X, Z)$, and since

$$B(A\mathbf{x}_j) = B \sum_i a_{ij} \mathbf{y}_i = \sum_i a_{ij} B\mathbf{y}_i
= \sum_i a_{ij} \sum_k b_{ki} \mathbf{z}_k = \sum_k \left( \sum_i b_{ki} a_{ij} \right) \mathbf{z}_k,$$

the independence of $\{\mathbf{z}_1, \dots, \mathbf{z}_p\}$ implies that

$$c_{kj} = \sum_i b_{ki} a_{ij} \qquad (1 \le k \le p,\ 1 \le j \le n). \tag{5}$$

This shows how to compute the $p$ by $n$ matrix $[BA]$ from $[B]$ and $[A]$. If we
define the product $[B][A]$ to be $[BA]$, then (5) describes the usual rule of matrix
multiplication.
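
Formula (5) says that composing two linear transformations corresponds to multiplying their matrices. The following sketch checks this on one small example with standard bases; the helper names `matvec` and `matmul` and the particular matrices are assumptions chosen here for illustration only.

```python
def matvec(M, x):
    """Apply the matrix M (a list of rows) to the coordinate vector x, as in (4)."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(B, A):
    """Return the matrix with entries c_kj = sum_i b_ki * a_ij, as in (5)."""
    p, m, n = len(B), len(A), len(A[0])
    return [[sum(B[k][i] * A[i][j] for i in range(m)) for j in range(n)]
            for k in range(p)]

A = [[1, 2, 0],
     [0, 1, -1]]          # 2 by 3: maps R^3 into R^2
B = [[2, 1],
     [0, 3],
     [1, 1],
     [4, 0]]              # 4 by 2: maps R^2 into R^4
x = [1, -2, 3]

composed = matvec(B, matvec(A, x))    # B(Ax)
product = matvec(matmul(B, A), x)     # [B][A] applied to x
print(composed, product)
```

Integer entries are used so that the two results agree exactly rather than up to rounding.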

Finally, suppose $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ and $\{\mathbf{y}_1, \dots, \mathbf{y}_m\}$ are standard bases of $R^n$ and
$R^m$, and $A$ is given by (4). The Schwarz inequality shows that

$$|A\mathbf{x}|^2 = \sum_i \left( \sum_j a_{ij} c_j \right)^2 \le \sum_i \left( \sum_j a_{ij}^2 \cdot \sum_j c_j^2 \right) = \sum_{i,j} a_{ij}^2\, |\mathbf{x}|^2.$$

Thus

$$\|A\| \le \left\{ \sum_{i,j} a_{ij}^2 \right\}^{1/2}. \tag{6}$$

If we apply (6) to $B - A$ in place of $A$, where $A, B \in L(R^n, R^m)$, we see
that if the matrix elements $a_{ij}$ are continuous functions of a parameter, then the
same is true of $A$. More precisely:

If $S$ is a metric space, if $a_{11}, \dots, a_{mn}$ are real continuous functions on $S$,
and if, for each $p \in S$, $A_p$ is the linear transformation of $R^n$ into $R^m$ whose matrix
has entries $a_{ij}(p)$, then the mapping $p \to A_p$ is a continuous mapping of $S$ into
$L(R^n, R^m)$.
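
Inequality (6) bounds the operator norm by the square root of the sum of the squared matrix entries. The sketch below illustrates this on one hypothetical matrix by sampling unit vectors; since sampling only produces a lower estimate of $\|A\|$, it is a plausibility check and not a computation of the norm. The names `entry_bound` and `sampled_norm` are assumptions introduced here.

```python
import math
import random

def entry_bound(A):
    """Right side of (6): square root of the sum of squares of the entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def sampled_norm(A, trials=2000, seed=0):
    """Largest |Ax| found over randomly sampled unit vectors x (a lower estimate of ||A||)."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = math.sqrt(sum(c * c for c in x))
        if r == 0.0:
            continue                       # skip the (practically impossible) zero vector
        x = [c / r for c in x]             # normalize so that |x| = 1
        Ax = [sum(row[j] * x[j] for j in range(n)) for row in A]
        best = max(best, math.sqrt(sum(c * c for c in Ax)))
    return best

A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
estimate = sampled_norm(A)
bound = entry_bound(A)
print(estimate, bound)
```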


DIFFERENTIATION 

9.10 Preliminaries In order to arrive at a definition of the derivative of a
function whose domain is $R^n$ (or an open subset of $R^n$), let us take another look
at the familiar case $n = 1$, and let us see how to interpret the derivative in that
case in a way which will naturally extend to $n > 1$.

If $f$ is a real function with domain $(a, b) \subset R^1$ and if $x \in (a, b)$, then $f'(x)$
is usually defined to be the real number

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \tag{7}$$

provided, of course, that this limit exists. Thus

$$f(x + h) - f(x) = f'(x)h + r(h) \tag{8}$$

where the "remainder" $r(h)$ is small, in the sense that

$$\lim_{h \to 0} \frac{r(h)}{h} = 0. \tag{9}$$

Note that (8) expresses the difference $f(x + h) - f(x)$ as the sum of the
linear function that takes $h$ to $f'(x)h$, plus a small remainder.

We can therefore regard the derivative of $f$ at $x$, not as a real number,
but as the linear operator on $R^1$ that takes $h$ to $f'(x)h$.

[Observe that every real number $\alpha$ gives rise to a linear operator on $R^1$;
the operator in question is simply multiplication by $\alpha$. Conversely, every linear
function that carries $R^1$ to $R^1$ is multiplication by some real number. It is this
natural 1-1 correspondence between $R^1$ and $L(R^1)$ which motivates the
preceding statements.]

Let us next consider a function $\mathbf{f}$ that maps $(a, b) \subset R^1$ into $R^m$. In that
case, $\mathbf{f}'(x)$ was defined to be that vector $\mathbf{y} \in R^m$ (if there is one) for which

$$\lim_{h \to 0} \left\{ \frac{\mathbf{f}(x + h) - \mathbf{f}(x)}{h} - \mathbf{y} \right\} = \mathbf{0}. \tag{10}$$

We can again rewrite this in the form

$$\mathbf{f}(x + h) - \mathbf{f}(x) = h\mathbf{y} + \mathbf{r}(h), \tag{11}$$

where $\mathbf{r}(h)/h \to \mathbf{0}$ as $h \to 0$. The main term on the right side of (11) is again a
linear function of $h$. Every $\mathbf{y} \in R^m$ induces a linear transformation of $R^1$ into
$R^m$, by associating to each $h \in R^1$ the vector $h\mathbf{y} \in R^m$. This identification of $R^m$
with $L(R^1, R^m)$ allows us to regard $\mathbf{f}'(x)$ as a member of $L(R^1, R^m)$.

Thus, if $\mathbf{f}$ is a differentiable mapping of $(a, b) \subset R^1$ into $R^m$, and if $x \in (a, b)$,
then $\mathbf{f}'(x)$ is the linear transformation of $R^1$ into $R^m$ that satisfies

$$\lim_{h \to 0} \frac{\mathbf{f}(x + h) - \mathbf{f}(x) - \mathbf{f}'(x)h}{h} = \mathbf{0}, \tag{12}$$

or, equivalently,

$$\lim_{h \to 0} \frac{|\mathbf{f}(x + h) - \mathbf{f}(x) - \mathbf{f}'(x)h|}{|h|} = 0. \tag{13}$$

We are now ready for the case $n > 1$.


9.11 Definition Suppose $E$ is an open set in $R^n$, $\mathbf{f}$ maps $E$ into $R^m$, and $\mathbf{x} \in E$.
If there exists a linear transformation $A$ of $R^n$ into $R^m$ such that

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{|\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - A\mathbf{h}|}{|\mathbf{h}|} = 0, \tag{14}$$

then we say that $\mathbf{f}$ is differentiable at $\mathbf{x}$, and we write

$$\mathbf{f}'(\mathbf{x}) = A. \tag{15}$$

If $\mathbf{f}$ is differentiable at every $\mathbf{x} \in E$, we say that $\mathbf{f}$ is differentiable in $E$.

It is of course understood in (14) that $\mathbf{h} \in R^n$. If $|\mathbf{h}|$ is small enough, then
$\mathbf{x} + \mathbf{h} \in E$, since $E$ is open. Thus $\mathbf{f}(\mathbf{x} + \mathbf{h})$ is defined, $\mathbf{f}(\mathbf{x} + \mathbf{h}) \in R^m$, and since
$A \in L(R^n, R^m)$, $A\mathbf{h} \in R^m$. Thus

$$\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - A\mathbf{h} \in R^m.$$

The norm in the numerator of (14) is that of $R^m$. In the denominator we have
the $R^n$-norm of $\mathbf{h}$.

There is an obvious uniqueness problem which has to be settled before
we go any further.


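9.12 Theorem Suppose $E$ and $\mathbf{f}$ are as in Definition 9.11, $\mathbf{x} \in E$, and (14) holds
with $A = A_1$ and with $A = A_2$. Then $A_1 = A_2$.

Proof If $B = A_1 - A_2$, the inequality

$$|B\mathbf{h}| \le |\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - A_1\mathbf{h}| + |\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - A_2\mathbf{h}|$$

shows that $|B\mathbf{h}|/|\mathbf{h}| \to 0$ as $\mathbf{h} \to \mathbf{0}$. For fixed $\mathbf{h} \ne \mathbf{0}$, it follows that

$$\frac{|B(t\mathbf{h})|}{|t\mathbf{h}|} \to 0 \quad \text{as} \quad t \to 0. \tag{16}$$

The linearity of $B$ shows that the left side of (16) is independent of $t$.
Thus $B\mathbf{h} = \mathbf{0}$ for every $\mathbf{h} \in R^n$. Hence $B = 0$.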


9.13 Remarks 


(a) The relation (14) can be rewritten in the form

$$\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) = \mathbf{f}'(\mathbf{x})\mathbf{h} + \mathbf{r}(\mathbf{h}) \tag{17}$$

where the remainder $\mathbf{r}(\mathbf{h})$ satisfies

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{|\mathbf{r}(\mathbf{h})|}{|\mathbf{h}|} = 0. \tag{18}$$

We may interpret (17), as in Sec. 9.10, by saying that for fixed $\mathbf{x}$ and small
$\mathbf{h}$, the left side of (17) is approximately equal to $\mathbf{f}'(\mathbf{x})\mathbf{h}$, that is, to the value
of a linear transformation applied to $\mathbf{h}$.

(b) Suppose $\mathbf{f}$ and $E$ are as in Definition 9.11, and $\mathbf{f}$ is differentiable in $E$.
For every $\mathbf{x} \in E$, $\mathbf{f}'(\mathbf{x})$ is then a function, namely, a linear transformation
of $R^n$ into $R^m$. But $\mathbf{f}'$ is also a function: $\mathbf{f}'$ maps $E$ into $L(R^n, R^m)$.

(c) A glance at (17) shows that $\mathbf{f}$ is continuous at any point at which $\mathbf{f}$ is
differentiable.

(d) The derivative defined by (14) or (17) is often called the differential
of $\mathbf{f}$ at $\mathbf{x}$, or the total derivative of $\mathbf{f}$ at $\mathbf{x}$, to distinguish it from the partial
derivatives that will occur later.
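
The meaning of (14) and (18) can be seen numerically: the quotient $|\mathbf{r}(\mathbf{h})|/|\mathbf{h}|$ should shrink as $\mathbf{h}$ shrinks. The sketch below uses a hypothetical map of $R^2$ into $R^2$ chosen only for illustration, with its matrix of partial derivatives written out by hand; none of these names come from the text.

```python
import math

def f(x, y):
    """Hypothetical example map from R^2 into R^2."""
    return (x * x * y, math.sin(x) + y)

def derivative_matrix(x, y):
    """Matrix of the linear transformation f'(x, y) for the example above."""
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def remainder_ratio(x, y, h1, h2):
    """Return |f(x + h) - f(x) - A h| / |h| with A = f'(x), as in (14)."""
    A = derivative_matrix(x, y)
    fx, fxh = f(x, y), f(x + h1, y + h2)
    r = [fxh[i] - fx[i] - (A[i][0] * h1 + A[i][1] * h2) for i in range(2)]
    return math.hypot(r[0], r[1]) / math.hypot(h1, h2)

# Shrink h along the fixed direction (0.6, 0.8).
ratios = [remainder_ratio(1.0, 2.0, 0.6 * t, 0.8 * t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this smooth example the ratio falls roughly in proportion to $|\mathbf{h}|$, which is stronger than (18) requires.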

9.14 Example We have defined derivatives of functions carrying $R^n$ to $R^m$ to
be linear transformations of $R^n$ into $R^m$. What is the derivative of such a linear
transformation? The answer is very simple.

If $A \in L(R^n, R^m)$ and if $\mathbf{x} \in R^n$, then

$$A'(\mathbf{x}) = A. \tag{19}$$

Note that $\mathbf{x}$ appears on the left side of (19), but not on the right. Both
sides of (19) are members of $L(R^n, R^m)$, whereas $A\mathbf{x} \in R^m$.

The proof of (19) is a triviality, since

$$A(\mathbf{x} + \mathbf{h}) - A\mathbf{x} = A\mathbf{h}, \tag{20}$$

by the linearity of $A$. With $\mathbf{f}(\mathbf{x}) = A\mathbf{x}$, the numerator in (14) is thus 0 for every
$\mathbf{h} \in R^n$. In (17), $\mathbf{r}(\mathbf{h}) = \mathbf{0}$.

We now extend the chain rule (Theorem 5.5) to the present situation.

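9.15 Theorem Suppose $E$ is an open set in $R^n$, $\mathbf{f}$ maps $E$ into $R^m$, $\mathbf{f}$ is differentiable
at $\mathbf{x}_0 \in E$, $\mathbf{g}$ maps an open set containing $\mathbf{f}(E)$ into $R^k$, and $\mathbf{g}$ is differentiable at
$\mathbf{f}(\mathbf{x}_0)$. Then the mapping $\mathbf{F}$ of $E$ into $R^k$ defined by

$$\mathbf{F}(\mathbf{x}) = \mathbf{g}(\mathbf{f}(\mathbf{x}))$$

is differentiable at $\mathbf{x}_0$, and

$$\mathbf{F}'(\mathbf{x}_0) = \mathbf{g}'(\mathbf{f}(\mathbf{x}_0))\,\mathbf{f}'(\mathbf{x}_0). \tag{21}$$

On the right side of (21), we have the product of two linear transformations,
as defined in Sec. 9.6.

Proof Put $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$, $A = \mathbf{f}'(\mathbf{x}_0)$, $B = \mathbf{g}'(\mathbf{y}_0)$, and define

$$\mathbf{u}(\mathbf{h}) = \mathbf{f}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{f}(\mathbf{x}_0) - A\mathbf{h},$$
$$\mathbf{v}(\mathbf{k}) = \mathbf{g}(\mathbf{y}_0 + \mathbf{k}) - \mathbf{g}(\mathbf{y}_0) - B\mathbf{k},$$

for all $\mathbf{h} \in R^n$ and $\mathbf{k} \in R^m$ for which $\mathbf{f}(\mathbf{x}_0 + \mathbf{h})$ and $\mathbf{g}(\mathbf{y}_0 + \mathbf{k})$ are defined.
Then

$$|\mathbf{u}(\mathbf{h})| = \varepsilon(\mathbf{h})\,|\mathbf{h}|, \qquad |\mathbf{v}(\mathbf{k})| = \eta(\mathbf{k})\,|\mathbf{k}|, \tag{22}$$

where $\varepsilon(\mathbf{h}) \to 0$ as $\mathbf{h} \to \mathbf{0}$ and $\eta(\mathbf{k}) \to 0$ as $\mathbf{k} \to \mathbf{0}$.

Given $\mathbf{h}$, put $\mathbf{k} = \mathbf{f}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{f}(\mathbf{x}_0)$. Then

$$|\mathbf{k}| = |A\mathbf{h} + \mathbf{u}(\mathbf{h})| \le [\|A\| + \varepsilon(\mathbf{h})]\,|\mathbf{h}|, \tag{23}$$

and

$$\mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - BA\mathbf{h} = \mathbf{g}(\mathbf{y}_0 + \mathbf{k}) - \mathbf{g}(\mathbf{y}_0) - BA\mathbf{h}
= B(\mathbf{k} - A\mathbf{h}) + \mathbf{v}(\mathbf{k})
= B\mathbf{u}(\mathbf{h}) + \mathbf{v}(\mathbf{k}).$$

Hence (22) and (23) imply, for $\mathbf{h} \ne \mathbf{0}$, that

$$\frac{|\mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - BA\mathbf{h}|}{|\mathbf{h}|} \le \|B\|\,\varepsilon(\mathbf{h}) + [\|A\| + \varepsilon(\mathbf{h})]\,\eta(\mathbf{k}).$$

Let $\mathbf{h} \to \mathbf{0}$. Then $\varepsilon(\mathbf{h}) \to 0$. Also, $\mathbf{k} \to \mathbf{0}$, by (23), so that $\eta(\mathbf{k}) \to 0$.
It follows that $\mathbf{F}'(\mathbf{x}_0) = BA$, which is what (21) asserts.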
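
The chain rule (21) can be checked on a concrete pair of maps by comparing the matrix product with a finite-difference approximation of the derivative of the composition. Everything in the sketch below is a hypothetical example chosen for illustration: the maps `f` and `g`, their hand-computed derivative matrices, and the central-difference helper.

```python
import math

def f(x):
    return [x[0] ** 2 * x[1], math.sin(x[0]) + x[1]]

def df(x):
    """Matrix of f'(x) for the example f."""
    return [[2 * x[0] * x[1], x[0] ** 2], [math.cos(x[0]), 1.0]]

def g(y):
    return [y[0] * y[1], y[0] + y[1] ** 2]

def dg(y):
    """Matrix of g'(y) for the example g."""
    return [[y[1], y[0]], [1.0, 2 * y[1]]]

def matmul(B, A):
    return [[sum(B[k][i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]
            for k in range(len(B))]

def numeric_matrix(F, x, eps=1e-6):
    """Approximate the matrix of F'(x) column by column with central differences."""
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        cols.append([(p - m) / (2 * eps) for p, m in zip(F(xp), F(xm))])
    return [[cols[j][i] for j in range(len(x))] for i in range(len(cols[0]))]

x0 = [1.0, 2.0]
chain = matmul(dg(f(x0)), df(x0))                  # g'(f(x0)) f'(x0)
numeric = numeric_matrix(lambda x: g(f(x)), x0)    # approximation of F'(x0)
max_gap = max(abs(chain[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```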


9.16 Partial derivatives We again consider a function $\mathbf{f}$ that maps an open
set $E \subset R^n$ into $R^m$. Let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ and $\{\mathbf{u}_1, \dots, \mathbf{u}_m\}$ be the standard bases of
$R^n$ and $R^m$. The components of $\mathbf{f}$ are the real functions $f_1, \dots, f_m$ defined by

$$\mathbf{f}(\mathbf{x}) = \sum_{i=1}^{m} f_i(\mathbf{x})\,\mathbf{u}_i \qquad (\mathbf{x} \in E), \tag{24}$$

or, equivalently, by $f_i(\mathbf{x}) = \mathbf{f}(\mathbf{x}) \cdot \mathbf{u}_i$, $1 \le i \le m$.

For $\mathbf{x} \in E$, $1 \le i \le m$, $1 \le j \le n$, we define

$$(D_j f_i)(\mathbf{x}) = \lim_{t \to 0} \frac{f_i(\mathbf{x} + t\mathbf{e}_j) - f_i(\mathbf{x})}{t}, \tag{25}$$

provided the limit exists. Writing $f_i(x_1, \dots, x_n)$ in place of $f_i(\mathbf{x})$, we see that
$D_j f_i$ is the derivative of $f_i$ with respect to $x_j$, keeping the other variables fixed.
The notation

$$\frac{\partial f_i}{\partial x_j} \tag{26}$$

is therefore often used in place of $D_j f_i$, and $D_j f_i$ is called a partial derivative.

In many cases where the existence of a derivative is sufficient when dealing
with functions of one variable, continuity or at least boundedness of the partial
derivatives is needed for functions of several variables. For example, the
functions $f$ and $g$ described in Exercise 7, Chap. 4, are not continuous, although
their partial derivatives exist at every point of $R^2$. Even for continuous functions,
the existence of all partial derivatives does not imply differentiability in the sense
of Definition 9.11; see Exercises 6 and 14, and Theorem 9.21.

However, if $\mathbf{f}$ is known to be differentiable at a point $\mathbf{x}$, then its partial
derivatives exist at $\mathbf{x}$, and they determine the linear transformation $\mathbf{f}'(\mathbf{x})$
completely:


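9.17 Theorem Suppose $\mathbf{f}$ maps an open set $E \subset R^n$ into $R^m$, and $\mathbf{f}$ is differentiable
at a point $\mathbf{x} \in E$. Then the partial derivatives $(D_j f_i)(\mathbf{x})$ exist, and

$$\mathbf{f}'(\mathbf{x})\mathbf{e}_j = \sum_{i=1}^{m} (D_j f_i)(\mathbf{x})\,\mathbf{u}_i \qquad (1 \le j \le n). \tag{27}$$

Here, as in Sec. 9.16, $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ and $\{\mathbf{u}_1, \dots, \mathbf{u}_m\}$ are the standard bases
of $R^n$ and $R^m$.

Proof Fix $j$. Since $\mathbf{f}$ is differentiable at $\mathbf{x}$,

$$\mathbf{f}(\mathbf{x} + t\mathbf{e}_j) - \mathbf{f}(\mathbf{x}) = \mathbf{f}'(\mathbf{x})(t\mathbf{e}_j) + \mathbf{r}(t\mathbf{e}_j)$$

where $|\mathbf{r}(t\mathbf{e}_j)|/t \to 0$ as $t \to 0$. The linearity of $\mathbf{f}'(\mathbf{x})$ shows therefore that

$$\lim_{t \to 0} \frac{\mathbf{f}(\mathbf{x} + t\mathbf{e}_j) - \mathbf{f}(\mathbf{x})}{t} = \mathbf{f}'(\mathbf{x})\mathbf{e}_j. \tag{28}$$

If we now represent $\mathbf{f}$ in terms of its components, as in (24), then (28)
becomes

$$\lim_{t \to 0} \sum_{i=1}^{m} \frac{f_i(\mathbf{x} + t\mathbf{e}_j) - f_i(\mathbf{x})}{t}\,\mathbf{u}_i = \mathbf{f}'(\mathbf{x})\mathbf{e}_j. \tag{29}$$

It follows that each quotient in this sum has a limit, as $t \to 0$ (see Theorem
4.10), so that each $(D_j f_i)(\mathbf{x})$ exists, and then (27) follows from (29).

Here are some consequences of Theorem 9.17:

Let $[\mathbf{f}'(\mathbf{x})]$ be the matrix that represents $\mathbf{f}'(\mathbf{x})$ with respect to our standard
bases, as in Sec. 9.9.

Then $\mathbf{f}'(\mathbf{x})\mathbf{e}_j$ is the $j$th column vector of $[\mathbf{f}'(\mathbf{x})]$, and (27) shows therefore
that the number $(D_j f_i)(\mathbf{x})$ occupies the spot in the $i$th row and $j$th column of
$[\mathbf{f}'(\mathbf{x})]$. Thus

$$[\mathbf{f}'(\mathbf{x})] = \begin{bmatrix}
(D_1 f_1)(\mathbf{x}) & \cdots & (D_n f_1)(\mathbf{x}) \\
\cdots & \cdots & \cdots \\
(D_1 f_m)(\mathbf{x}) & \cdots & (D_n f_m)(\mathbf{x})
\end{bmatrix}.$$

If $\mathbf{h} = \sum h_j \mathbf{e}_j$ is any vector in $R^n$, then (27) implies that

$$\mathbf{f}'(\mathbf{x})\mathbf{h} = \sum_{i=1}^{m} \left\{ \sum_{j=1}^{n} (D_j f_i)(\mathbf{x})\,h_j \right\} \mathbf{u}_i. \tag{30}$$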

9.18 Example Let $\gamma$ be a differentiable mapping of the segment $(a, b) \subset R^1$
into an open set $E \subset R^n$; in other words, $\gamma$ is a differentiable curve in $E$. Let $f$
be a real-valued differentiable function with domain $E$. Thus $f$ is a differentiable
mapping of $E$ into $R^1$. Define

$$g(t) = f(\gamma(t)) \qquad (a < t < b). \tag{31}$$

The chain rule asserts then that

$$g'(t) = f'(\gamma(t))\,\gamma'(t) \qquad (a < t < b). \tag{32}$$

Since $\gamma'(t) \in L(R^1, R^n)$ and $f'(\gamma(t)) \in L(R^n, R^1)$, (32) defines $g'(t)$ as a linear
operator on $R^1$. This agrees with the fact that $g$ maps $(a, b)$ into $R^1$. However,
$g'(t)$ can also be regarded as a real number. (This was discussed in Sec. 9.10.)
This number can be computed in terms of the partial derivatives of $f$ and the
derivatives of the components of $\gamma$, as we shall now see.

With respect to the standard basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $R^n$, $[\gamma'(t)]$ is the $n$ by 1
matrix (a "column matrix") which has $\gamma_i'(t)$ in the $i$th row, where $\gamma_1, \dots, \gamma_n$ are
the components of $\gamma$. For every $\mathbf{x} \in E$, $[f'(\mathbf{x})]$ is the 1 by $n$ matrix (a "row matrix")
which has $(D_j f)(\mathbf{x})$ in the $j$th column. Hence $[g'(t)]$ is the 1 by 1 matrix whose
only entry is the real number

$$g'(t) = \sum_{i=1}^{n} (D_i f)(\gamma(t))\,\gamma_i'(t). \tag{33}$$

This is a frequently encountered special case of the chain rule. It can be 
rephrased in the following manner. 

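Associate with each $\mathbf{x} \in E$ a vector, the so-called "gradient" of $f$ at $\mathbf{x}$,
defined by

$$(\nabla f)(\mathbf{x}) = \sum_{i=1}^{n} (D_i f)(\mathbf{x})\,\mathbf{e}_i. \tag{34}$$

Since

$$\gamma'(t) = \sum_{i=1}^{n} \gamma_i'(t)\,\mathbf{e}_i, \tag{35}$$

(33) can be written in the form

$$g'(t) = (\nabla f)(\gamma(t)) \cdot \gamma'(t), \tag{36}$$

the scalar product of the vectors $(\nabla f)(\gamma(t))$ and $\gamma'(t)$.

Let us now fix an $\mathbf{x} \in E$, let $\mathbf{u} \in R^n$ be a unit vector (that is, $|\mathbf{u}| = 1$), and
specialize $\gamma$ so that

$$\gamma(t) = \mathbf{x} + t\mathbf{u} \qquad (-\infty < t < \infty). \tag{37}$$

Then $\gamma'(t) = \mathbf{u}$ for every $t$. Hence (36) shows that

$$g'(0) = (\nabla f)(\mathbf{x}) \cdot \mathbf{u}. \tag{38}$$

On the other hand, (37) shows that

$$g(t) - g(0) = f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x}).$$

Hence (38) gives

$$\lim_{t \to 0} \frac{f(\mathbf{x} + t\mathbf{u}) - f(\mathbf{x})}{t} = (\nabla f)(\mathbf{x}) \cdot \mathbf{u}. \tag{39}$$

The limit in (39) is usually called the directional derivative of $f$ at $\mathbf{x}$, in the
direction of the unit vector $\mathbf{u}$, and may be denoted by $(D_{\mathbf{u}} f)(\mathbf{x})$.

If $f$ and $\mathbf{x}$ are fixed, but $\mathbf{u}$ varies, then (39) shows that $(D_{\mathbf{u}} f)(\mathbf{x})$ attains its
maximum when $\mathbf{u}$ is a positive scalar multiple of $(\nabla f)(\mathbf{x})$. [The case $(\nabla f)(\mathbf{x}) = \mathbf{0}$
should be excluded here.]

If $\mathbf{u} = \sum u_i \mathbf{e}_i$, then (39) shows that $(D_{\mathbf{u}} f)(\mathbf{x})$ can be expressed in terms of
the partial derivatives of $f$ at $\mathbf{x}$ by the formula

$$(D_{\mathbf{u}} f)(\mathbf{x}) = \sum_{i=1}^{n} (D_i f)(\mathbf{x})\,u_i. \tag{40}$$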
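
The identity between the directional derivative (39) and the sum (40) can be checked numerically. The following is a minimal sketch, not part of the text: the test function, the point, the unit vector, and the step sizes are all arbitrary illustrative choices, and symmetric difference quotients are used only as a convenient approximation to the limits involved.

```python
import math

def gradient(f, x, eps=1e-6):
    """Approximate (grad f)(x) by central differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((f(xp) - f(xm)) / (2 * eps))
    return grad

def directional_derivative(f, x, u, t=1e-6):
    """Approximate the limit in (39) by a symmetric difference quotient along u."""
    forward = [xi + t * ui for xi, ui in zip(x, u)]
    backward = [xi - t * ui for xi, ui in zip(x, u)]
    return (f(forward) - f(backward)) / (2 * t)

def f(p):
    # Hypothetical smooth test function f(x, y) = x^2 y + sin y.
    x, y = p
    return x * x * y + math.sin(y)

x0 = [1.0, 2.0]
u = [0.6, 0.8]          # a unit vector: 0.36 + 0.64 = 1

lhs = directional_derivative(f, x0, u)                    # left side of (40)
rhs = sum(g * ui for g, ui in zip(gradient(f, x0), u))    # right side of (40)
```

Up to discretization and rounding error, `lhs` and `rhs` agree with each other and with the exact value $4 \cdot 0.6 + (1 + \cos 2) \cdot 0.8$ obtained from the partial derivatives $2xy$ and $x^2 + \cos y$.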

Some of these ideas will play a role in the following theorem. 

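9.19 Theorem Suppose $\mathbf{f}$ maps a convex open set $E \subset R^n$ into $R^m$, $\mathbf{f}$ is differentiable
in $E$, and there is a real number $M$ such that

$$\|\mathbf{f}'(\mathbf{x})\| \le M$$

for every $\mathbf{x} \in E$. Then

$$|\mathbf{f}(\mathbf{b}) - \mathbf{f}(\mathbf{a})| \le M\,|\mathbf{b} - \mathbf{a}|$$

for all $\mathbf{a} \in E$, $\mathbf{b} \in E$.

Proof Fix $\mathbf{a} \in E$, $\mathbf{b} \in E$. Define

$$\gamma(t) = (1 - t)\mathbf{a} + t\mathbf{b}$$

for all $t \in R^1$ such that $\gamma(t) \in E$. Since $E$ is convex, $\gamma(t) \in E$ if $0 \le t \le 1$.
Put

$$\mathbf{g}(t) = \mathbf{f}(\gamma(t)).$$

Then

$$\mathbf{g}'(t) = \mathbf{f}'(\gamma(t))\,\gamma'(t) = \mathbf{f}'(\gamma(t))(\mathbf{b} - \mathbf{a}),$$

so that

$$|\mathbf{g}'(t)| \le \|\mathbf{f}'(\gamma(t))\|\,|\mathbf{b} - \mathbf{a}| \le M\,|\mathbf{b} - \mathbf{a}|$$

for all $t \in [0, 1]$. By Theorem 5.19,

$$|\mathbf{g}(1) - \mathbf{g}(0)| \le M\,|\mathbf{b} - \mathbf{a}|.$$

But $\mathbf{g}(0) = \mathbf{f}(\mathbf{a})$ and $\mathbf{g}(1) = \mathbf{f}(\mathbf{b})$. This completes the proof.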

Corollary If, in addition, $\mathbf{f}'(\mathbf{x}) = 0$ for all $\mathbf{x} \in E$, then $\mathbf{f}$ is constant.

Proof To prove this, note that the hypotheses of the theorem hold now
with $M = 0$.

9.20 Definition A differentiable mapping $\mathbf{f}$ of an open set $E \subset R^n$ into $R^m$ is
said to be continuously differentiable in $E$ if $\mathbf{f}'$ is a continuous mapping of $E$
into $L(R^n, R^m)$.

More explicitly, it is required that to every $\mathbf{x} \in E$ and to every $\varepsilon > 0$
corresponds a $\delta > 0$ such that

$$\|\mathbf{f}'(\mathbf{y}) - \mathbf{f}'(\mathbf{x})\| < \varepsilon$$

if $\mathbf{y} \in E$ and $|\mathbf{x} - \mathbf{y}| < \delta$.

If this is so, we also say that $\mathbf{f}$ is a $\mathscr{C}'$-mapping, or that $\mathbf{f} \in \mathscr{C}'(E)$.

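9.21 Theorem Suppose $\mathbf{f}$ maps an open set $E \subset R^n$ into $R^m$. Then $\mathbf{f} \in \mathscr{C}'(E)$ if
and only if the partial derivatives $D_j f_i$ exist and are continuous on $E$ for $1 \le i \le m$,
$1 \le j \le n$.

Proof Assume first that $\mathbf{f} \in \mathscr{C}'(E)$. By (27),

$$(D_j f_i)(\mathbf{x}) = (\mathbf{f}'(\mathbf{x})\mathbf{e}_j) \cdot \mathbf{u}_i$$

for all $i$, $j$, and for all $\mathbf{x} \in E$. Hence

$$(D_j f_i)(\mathbf{y}) - (D_j f_i)(\mathbf{x}) = \{[\mathbf{f}'(\mathbf{y}) - \mathbf{f}'(\mathbf{x})]\mathbf{e}_j\} \cdot \mathbf{u}_i,$$

and since $|\mathbf{u}_i| = |\mathbf{e}_j| = 1$, it follows that

$$|(D_j f_i)(\mathbf{y}) - (D_j f_i)(\mathbf{x})| \le |[\mathbf{f}'(\mathbf{y}) - \mathbf{f}'(\mathbf{x})]\mathbf{e}_j| \le \|\mathbf{f}'(\mathbf{y}) - \mathbf{f}'(\mathbf{x})\|.$$

Hence $D_j f_i$ is continuous.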

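For the converse, it suffices to consider the case $m = 1$. (Why?)
Fix $\mathbf{x} \in E$ and $\varepsilon > 0$. Since $E$ is open, there is an open ball $S \subset E$, with
center at $\mathbf{x}$ and radius $r$, and the continuity of the functions $D_j f$ shows
that $r$ can be chosen so that

$$|(D_j f)(\mathbf{y}) - (D_j f)(\mathbf{x})| < \frac{\varepsilon}{n} \qquad (\mathbf{y} \in S,\ 1 \le j \le n). \tag{41}$$

Suppose $\mathbf{h} = \sum h_j \mathbf{e}_j$, $|\mathbf{h}| < r$, put $\mathbf{v}_0 = \mathbf{0}$, and $\mathbf{v}_k = h_1\mathbf{e}_1 + \cdots + h_k\mathbf{e}_k$
for $1 \le k \le n$. Then

$$f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) = \sum_{j=1}^{n} [f(\mathbf{x} + \mathbf{v}_j) - f(\mathbf{x} + \mathbf{v}_{j-1})]. \tag{42}$$

Since $|\mathbf{v}_k| < r$ for $1 \le k \le n$ and since $S$ is convex, the segments with end
points $\mathbf{x} + \mathbf{v}_{j-1}$ and $\mathbf{x} + \mathbf{v}_j$ lie in $S$. Since $\mathbf{v}_j = \mathbf{v}_{j-1} + h_j\mathbf{e}_j$, the mean
value theorem (5.10) shows that the $j$th summand in (42) is equal to

$$h_j (D_j f)(\mathbf{x} + \mathbf{v}_{j-1} + \theta_j h_j \mathbf{e}_j)$$

for some $\theta_j \in (0, 1)$, and this differs from $h_j (D_j f)(\mathbf{x})$ by less than $|h_j|\varepsilon/n$,
using (41). By (42), it follows that

$$\Bigl| f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - \sum_{j=1}^{n} h_j (D_j f)(\mathbf{x}) \Bigr| \le \frac{1}{n} \sum_{j=1}^{n} |h_j|\,\varepsilon \le |\mathbf{h}|\,\varepsilon$$

for all $\mathbf{h}$ such that $|\mathbf{h}| < r$.

This says that $f$ is differentiable at $\mathbf{x}$ and that $f'(\mathbf{x})$ is the linear
function which assigns the number $\sum h_j (D_j f)(\mathbf{x})$ to the vector $\mathbf{h} = \sum h_j \mathbf{e}_j$.
The matrix $[f'(\mathbf{x})]$ consists of the row $(D_1 f)(\mathbf{x}), \dots, (D_n f)(\mathbf{x})$; and since
$D_1 f, \dots, D_n f$ are continuous functions on $E$, the concluding remarks of
Sec. 9.9 show that $f \in \mathscr{C}'(E)$.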


THE CONTRACTION PRINCIPLE 

We now interrupt our discussion of differentiation to insert a fixed point 
theorem that is valid in arbitrary complete metric spaces. It will be used in the 
proof of the inverse function theorem. 

9.22 Definition Let $X$ be a metric space, with metric $d$. If $\varphi$ maps $X$ into $X$
and if there is a number $c < 1$ such that

$$d(\varphi(x), \varphi(y)) \le c\,d(x, y) \tag{43}$$

for all $x, y \in X$, then $\varphi$ is said to be a contraction of $X$ into $X$.

9.23 Theorem If $X$ is a complete metric space, and if $\varphi$ is a contraction of $X$
into $X$, then there exists one and only one $x \in X$ such that $\varphi(x) = x$.

In other words, $\varphi$ has a unique fixed point. The uniqueness is a triviality,
for if $\varphi(x) = x$ and $\varphi(y) = y$, then (43) gives $d(x, y) \le c\,d(x, y)$, which can only
happen when $d(x, y) = 0$.

The existence of a fixed point of $\varphi$ is the essential part of the theorem.
The proof actually furnishes a constructive method for locating the fixed point.

Proof Pick $x_0 \in X$ arbitrarily, and define $\{x_n\}$ recursively, by setting

$$x_{n+1} = \varphi(x_n) \qquad (n = 0, 1, 2, \dots). \tag{44}$$

Choose $c < 1$ so that (43) holds. For $n \ge 1$ we then have

$$d(x_{n+1}, x_n) = d(\varphi(x_n), \varphi(x_{n-1})) \le c\,d(x_n, x_{n-1}).$$

Hence induction gives

$$d(x_{n+1}, x_n) \le c^n\,d(x_1, x_0) \qquad (n = 0, 1, 2, \dots). \tag{45}$$

If $n < m$, it follows that

$$d(x_n, x_m) \le \sum_{i=n+1}^{m} d(x_i, x_{i-1}) \le (c^n + c^{n+1} + \cdots + c^{m-1})\,d(x_1, x_0) \le [(1 - c)^{-1} d(x_1, x_0)]\,c^n.$$

Thus $\{x_n\}$ is a Cauchy sequence. Since $X$ is complete, $\lim x_n = x$ for some
$x \in X$.

Since $\varphi$ is a contraction, $\varphi$ is continuous (in fact, uniformly
continuous) on $X$. Hence

$$\varphi(x) = \lim_{n \to \infty} \varphi(x_n) = \lim_{n \to \infty} x_{n+1} = x.$$
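
The constructive method in this proof, the recursion (44), can be sketched in a few lines. This is an illustration under stated assumptions rather than part of the text: the function name `fixed_point`, the tolerance, and the example $\varphi = \cos$ on $X = [0, 1]$ (a contraction there with constant $c = \sin 1$, since $|\cos'| = |\sin| \le \sin 1$ on that interval) are choices made here. The stopping rule uses the same geometric-series estimate as the Cauchy argument, in the form $d(x_n, x) \le \frac{c}{1-c}\,d(x_n, x_{n-1})$.

```python
import math

def fixed_point(phi, x0, c, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = phi(x_n) as in (44).

    c is a contraction constant for phi (assumed known, with c < 1).
    Stops once c/(1-c) * |x_n - x_{n-1}| <= tol, which bounds the
    distance from x_n to the fixed point.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if c / (1.0 - c) * abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# cos maps [0, 1] into itself and contracts distances by at most sin(1) < 1.
root, steps = fixed_point(math.cos, 0.5, c=math.sin(1.0))
```

The returned `root` satisfies $\cos(\text{root}) = \text{root}$ to within rounding error; `steps` records how many applications of $\varphi$ were needed.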


THE INVERSE FUNCTION THEOREM 

The inverse function theorem states, roughly speaking, that a continuously 
differentiable mapping f is invertible in a neighborhood of any point x at which 
the linear transformation f '(x) is invertible: 

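9.24 Theorem Suppose $\mathbf{f}$ is a $\mathscr{C}'$-mapping of an open set $E \subset R^n$ into $R^n$, $\mathbf{f}'(\mathbf{a})$
is invertible for some $\mathbf{a} \in E$, and $\mathbf{b} = \mathbf{f}(\mathbf{a})$. Then

(a) there exist open sets $U$ and $V$ in $R^n$ such that $\mathbf{a} \in U$, $\mathbf{b} \in V$, $\mathbf{f}$ is
one-to-one on $U$, and $\mathbf{f}(U) = V$;

(b) if $\mathbf{g}$ is the inverse of $\mathbf{f}$ [which exists, by (a)], defined in $V$ by

$$\mathbf{g}(\mathbf{f}(\mathbf{x})) = \mathbf{x} \qquad (\mathbf{x} \in U),$$

then $\mathbf{g} \in \mathscr{C}'(V)$.

Writing the equation $\mathbf{y} = \mathbf{f}(\mathbf{x})$ in component form, we arrive at the following
interpretation of the conclusion of the theorem: The system of $n$ equations

$$y_i = f_i(x_1, \dots, x_n) \qquad (1 \le i \le n)$$

can be solved for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$, if we restrict $\mathbf{x}$ and $\mathbf{y}$ to small
enough neighborhoods of $\mathbf{a}$ and $\mathbf{b}$; the solutions are unique and continuously
differentiable.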

Proof 

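(a) Put $\mathbf{f}'(\mathbf{a}) = A$, and choose $\lambda$ so that

$$2\lambda \|A^{-1}\| = 1. \tag{46}$$

Since $\mathbf{f}'$ is continuous at $\mathbf{a}$, there is an open ball $U \subset E$, with center at $\mathbf{a}$,
such that

$$\|\mathbf{f}'(\mathbf{x}) - A\| < \lambda \qquad (\mathbf{x} \in U). \tag{47}$$

We associate to each $\mathbf{y} \in R^n$ a function $\varphi$, defined by

$$\varphi(\mathbf{x}) = \mathbf{x} + A^{-1}(\mathbf{y} - \mathbf{f}(\mathbf{x})) \qquad (\mathbf{x} \in E). \tag{48}$$

Note that $\mathbf{f}(\mathbf{x}) = \mathbf{y}$ if and only if $\mathbf{x}$ is a fixed point of $\varphi$.

Since $\varphi'(\mathbf{x}) = I - A^{-1}\mathbf{f}'(\mathbf{x}) = A^{-1}(A - \mathbf{f}'(\mathbf{x}))$, (46) and (47) imply
that

$$\|\varphi'(\mathbf{x})\| < \tfrac{1}{2} \qquad (\mathbf{x} \in U). \tag{49}$$

Hence

$$|\varphi(\mathbf{x}_1) - \varphi(\mathbf{x}_2)| \le \tfrac{1}{2}\,|\mathbf{x}_1 - \mathbf{x}_2| \qquad (\mathbf{x}_1, \mathbf{x}_2 \in U), \tag{50}$$

by Theorem 9.19. It follows that $\varphi$ has at most one fixed point in $U$, so
that $\mathbf{f}(\mathbf{x}) = \mathbf{y}$ for at most one $\mathbf{x} \in U$.

Thus $\mathbf{f}$ is 1-1 in $U$.

Next, put $V = \mathbf{f}(U)$, and pick $\mathbf{y}_0 \in V$. Then $\mathbf{y}_0 = \mathbf{f}(\mathbf{x}_0)$ for some
$\mathbf{x}_0 \in U$. Let $B$ be an open ball with center at $\mathbf{x}_0$ and radius $r > 0$, so small
that its closure $\bar{B}$ lies in $U$. We will show that $\mathbf{y} \in V$ whenever $|\mathbf{y} - \mathbf{y}_0| < \lambda r$.
This proves, of course, that $V$ is open.

Fix $\mathbf{y}$, $|\mathbf{y} - \mathbf{y}_0| < \lambda r$. With $\varphi$ as in (48),

$$|\varphi(\mathbf{x}_0) - \mathbf{x}_0| = |A^{-1}(\mathbf{y} - \mathbf{y}_0)| < \|A^{-1}\|\,\lambda r = \frac{r}{2}.$$

If $\mathbf{x} \in \bar{B}$, it therefore follows from (50) that

$$|\varphi(\mathbf{x}) - \mathbf{x}_0| \le |\varphi(\mathbf{x}) - \varphi(\mathbf{x}_0)| + |\varphi(\mathbf{x}_0) - \mathbf{x}_0| < \tfrac{1}{2}\,|\mathbf{x} - \mathbf{x}_0| + \frac{r}{2} \le r;$$

hence $\varphi(\mathbf{x}) \in B$. Note that (50) holds if $\mathbf{x}_1 \in \bar{B}$, $\mathbf{x}_2 \in \bar{B}$.

Thus $\varphi$ is a contraction of $\bar{B}$ into $\bar{B}$. Being a closed subset of $R^n$,
$\bar{B}$ is complete. Theorem 9.23 implies therefore that $\varphi$ has a fixed point
$\mathbf{x} \in \bar{B}$. For this $\mathbf{x}$, $\mathbf{f}(\mathbf{x}) = \mathbf{y}$. Thus $\mathbf{y} \in \mathbf{f}(\bar{B}) \subset \mathbf{f}(U) = V$.

This proves part (a) of the theorem.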
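
The map $\varphi$ of (48) also gives a practical way to solve $\mathbf{f}(\mathbf{x}) = \mathbf{y}$ for $\mathbf{y}$ near $\mathbf{b}$: iterate $\varphi$, keeping the fixed operator $A^{-1}$. The sketch below is only an illustration; the mapping `f`, the point $\mathbf{a} = (1, 1)$, the target `y`, and the tolerances are hypothetical choices, and no attempt is made to verify the radius condition $|\mathbf{y} - \mathbf{y}_0| < \lambda r$ of the proof. The iteration simply reports failure if it does not settle.

```python
def f(x):
    # Hypothetical C'-mapping of R^2 into R^2 with f(1, 1) = (3, 4).
    x1, x2 = x
    return (2.0 * x1 + x2 * x2, x1 * x1 + 3.0 * x2)

# A = f'(a) at a = (1, 1) has matrix [[2, 2], [2, 3]] (determinant 2);
# its inverse, computed by hand, is [[1.5, -1], [-1, 1]].
A_INV = ((1.5, -1.0), (-1.0, 1.0))

def phi(x, y):
    """phi(x) = x + A^{-1}(y - f(x)), as in (48)."""
    fx = f(x)
    r = (y[0] - fx[0], y[1] - fx[1])
    return (x[0] + A_INV[0][0] * r[0] + A_INV[0][1] * r[1],
            x[1] + A_INV[1][0] * r[0] + A_INV[1][1] * r[1])

def solve(y, x0=(1.0, 1.0), tol=1e-13, max_iter=200):
    """Iterate phi starting from x0 = a until successive iterates agree."""
    x = x0
    for _ in range(max_iter):
        x_next = phi(x, y)
        if max(abs(x_next[0] - x[0]), abs(x_next[1] - x[1])) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not settle; y may be too far from b")

y = (3.1, 4.2)      # a point near b = f(a) = (3, 4)
x = solve(y)        # the unique nearby solution of f(x) = y
```

Because $A^{-1}$ is computed once and never updated, this is the simplified iteration of the proof rather than Newton's method; it converges linearly, at a rate governed by the bound (49).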

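(b) Pick $\mathbf{y} \in V$, $\mathbf{y} + \mathbf{k} \in V$. Then there exist $\mathbf{x} \in U$, $\mathbf{x} + \mathbf{h} \in U$, so that
$\mathbf{y} = \mathbf{f}(\mathbf{x})$, $\mathbf{y} + \mathbf{k} = \mathbf{f}(\mathbf{x} + \mathbf{h})$. With $\varphi$ as in (48),

$$\varphi(\mathbf{x} + \mathbf{h}) - \varphi(\mathbf{x}) = \mathbf{h} + A^{-1}[\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x} + \mathbf{h})] = \mathbf{h} - A^{-1}\mathbf{k}.$$

By (50), $|\mathbf{h} - A^{-1}\mathbf{k}| \le \tfrac{1}{2}|\mathbf{h}|$. Hence $|A^{-1}\mathbf{k}| \ge \tfrac{1}{2}|\mathbf{h}|$, and

$$|\mathbf{h}| \le 2\|A^{-1}\|\,|\mathbf{k}| = \lambda^{-1}|\mathbf{k}|. \tag{51}$$

By (46), (47), and Theorem 9.8, $\mathbf{f}'(\mathbf{x})$ has an inverse, say $T$. Since

$$\mathbf{g}(\mathbf{y} + \mathbf{k}) - \mathbf{g}(\mathbf{y}) - T\mathbf{k} = \mathbf{h} - T\mathbf{k} = -T[\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - \mathbf{f}'(\mathbf{x})\mathbf{h}],$$

(51) implies

$$\frac{|\mathbf{g}(\mathbf{y} + \mathbf{k}) - \mathbf{g}(\mathbf{y}) - T\mathbf{k}|}{|\mathbf{k}|} \le \frac{\|T\|}{\lambda} \cdot \frac{|\mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - \mathbf{f}'(\mathbf{x})\mathbf{h}|}{|\mathbf{h}|}.$$

As $\mathbf{k} \to \mathbf{0}$, (51) shows that $\mathbf{h} \to \mathbf{0}$. The right side of the last inequality
thus tends to 0. Hence the same is true of the left. We have thus proved
that $\mathbf{g}'(\mathbf{y}) = T$. But $T$ was chosen to be the inverse of $\mathbf{f}'(\mathbf{x}) = \mathbf{f}'(\mathbf{g}(\mathbf{y}))$. Thus

$$\mathbf{g}'(\mathbf{y}) = \{\mathbf{f}'(\mathbf{g}(\mathbf{y}))\}^{-1} \qquad (\mathbf{y} \in V). \tag{52}$$

Finally, note that $\mathbf{g}$ is a continuous mapping of $V$ onto $U$ (since $\mathbf{g}$
is differentiable), that $\mathbf{f}'$ is a continuous mapping of $U$ into the set $\Omega$ of
all invertible elements of $L(R^n)$, and that inversion is a continuous mapping
of $\Omega$ onto $\Omega$, by Theorem 9.8. If we combine these facts with (52), we see
that $\mathbf{g} \in \mathscr{C}'(V)$.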

This completes the proof. 

Remark. The full force of the assumption that $\mathbf{f} \in \mathscr{C}'(E)$ was only used
in the last paragraph of the preceding proof. Everything else, down to Eq. (52),
was derived from the existence of $\mathbf{f}'(\mathbf{x})$ for $\mathbf{x} \in E$, the invertibility of $\mathbf{f}'(\mathbf{a})$, and
the continuity of $\mathbf{f}'$ at just the point $\mathbf{a}$. In this connection, we refer to the article
by A. Nijenhuis in Amer. Math. Monthly, vol. 81, 1974, pp. 969-980.

The following is an immediate consequence of part (a) of the inverse 
function theorem. 

9.25 Theorem If $\mathbf{f}$ is a $\mathscr{C}'$-mapping of an open set $E \subset R^n$ into $R^n$ and if $\mathbf{f}'(\mathbf{x})$
is invertible for every $\mathbf{x} \in E$, then $\mathbf{f}(W)$ is an open subset of $R^n$ for every open set
$W \subset E$.

In other words, f is an open mapping of E into R n . 

The hypotheses made in this theorem ensure that each point $\mathbf{x} \in E$ has a
neighborhood in which $\mathbf{f}$ is 1-1. This may be expressed by saying that $\mathbf{f}$ is
locally one-to-one in $E$. But $\mathbf{f}$ need not be 1-1 in $E$ under these circumstances.
For an example, see Exercise 17.


THE IMPLICIT FUNCTION THEOREM 

If $f$ is a continuously differentiable real function in the plane, then the equation
$f(x, y) = 0$ can be solved for $y$ in terms of $x$ in a neighborhood of any point
$(a, b)$ at which $f(a, b) = 0$ and $\partial f/\partial y \ne 0$. Likewise, one can solve for $x$ in terms
of $y$ near $(a, b)$ if $\partial f/\partial x \ne 0$ at $(a, b)$. For a simple example which illustrates
the need for assuming $\partial f/\partial y \ne 0$, consider $f(x, y) = x^2 + y^2 - 1$.

The preceding very informal statement is the simplest case (the case
$m = n = 1$ of Theorem 9.28) of the so-called "implicit function theorem." Its
proof makes strong use of the fact that continuously differentiable transformations
behave locally very much like their derivatives. Accordingly, we first prove
Theorem 9.27, the linear version of Theorem 9.28.

9.26 Notation If $\mathbf{x} = (x_1, \dots, x_n) \in R^n$ and $\mathbf{y} = (y_1, \dots, y_m) \in R^m$, let us write
$(\mathbf{x}, \mathbf{y})$ for the point (or vector)

$$(x_1, \dots, x_n, y_1, \dots, y_m) \in R^{n+m}.$$

In what follows, the first entry in $(\mathbf{x}, \mathbf{y})$ or in a similar symbol will always be a
vector in $R^n$, the second will be a vector in $R^m$.

Every $A \in L(R^{n+m}, R^n)$ can be split into two linear transformations $A_x$ and
$A_y$, defined by

$$A_x \mathbf{h} = A(\mathbf{h}, \mathbf{0}), \qquad A_y \mathbf{k} = A(\mathbf{0}, \mathbf{k}) \tag{53}$$

for any $\mathbf{h} \in R^n$, $\mathbf{k} \in R^m$. Then $A_x \in L(R^n)$, $A_y \in L(R^m, R^n)$, and

$$A(\mathbf{h}, \mathbf{k}) = A_x \mathbf{h} + A_y \mathbf{k}. \tag{54}$$

The linear version of the implicit function theorem is now almost obvious. 

9.27 Theorem If $A \in L(R^{n+m}, R^n)$ and if $A_x$ is invertible, then there corresponds
to every $\mathbf{k} \in R^m$ a unique $\mathbf{h} \in R^n$ such that $A(\mathbf{h}, \mathbf{k}) = \mathbf{0}$.

This $\mathbf{h}$ can be computed from $\mathbf{k}$ by the formula

$$\mathbf{h} = -(A_x)^{-1} A_y \mathbf{k}. \tag{55}$$

Proof By (54), $A(\mathbf{h}, \mathbf{k}) = \mathbf{0}$ if and only if

$$A_x \mathbf{h} + A_y \mathbf{k} = \mathbf{0},$$

which is the same as (55) when $A_x$ is invertible.

The conclusion of Theorem 9.27 is, in other words, that the equation
$A(\mathbf{h}, \mathbf{k}) = \mathbf{0}$ can be solved (uniquely) for $\mathbf{h}$ if $\mathbf{k}$ is given, and that the solution $\mathbf{h}$
is a linear function of $\mathbf{k}$. Those who have some acquaintance with linear algebra
will recognize this as a very familiar statement about systems of linear equations.

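9.28 Theorem Let $\mathbf{f}$ be a $\mathscr{C}'$-mapping of an open set $E \subset R^{n+m}$ into $R^n$, such
that $\mathbf{f}(\mathbf{a}, \mathbf{b}) = \mathbf{0}$ for some point $(\mathbf{a}, \mathbf{b}) \in E$.

Put $A = \mathbf{f}'(\mathbf{a}, \mathbf{b})$ and assume that $A_x$ is invertible.

Then there exist open sets $U \subset R^{n+m}$ and $W \subset R^m$, with $(\mathbf{a}, \mathbf{b}) \in U$ and
$\mathbf{b} \in W$, having the following property:

To every $\mathbf{y} \in W$ corresponds a unique $\mathbf{x}$ such that

$$(\mathbf{x}, \mathbf{y}) \in U \quad \text{and} \quad \mathbf{f}(\mathbf{x}, \mathbf{y}) = \mathbf{0}. \tag{56}$$

If this $\mathbf{x}$ is defined to be $\mathbf{g}(\mathbf{y})$, then $\mathbf{g}$ is a $\mathscr{C}'$-mapping of $W$ into $R^n$, $\mathbf{g}(\mathbf{b}) = \mathbf{a}$,

$$\mathbf{f}(\mathbf{g}(\mathbf{y}), \mathbf{y}) = \mathbf{0} \qquad (\mathbf{y} \in W), \tag{57}$$

and

$$\mathbf{g}'(\mathbf{b}) = -(A_x)^{-1} A_y. \tag{58}$$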

The function g is “implicitly” defined by (57). Hence the name of the 
theorem. 

The equation $\mathbf{f}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ can be written as a system of $n$ equations in
$n + m$ variables:

$$\begin{aligned} f_1(x_1, \dots, x_n, y_1, \dots, y_m) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \dots, x_n, y_1, \dots, y_m) &= 0. \end{aligned} \tag{59}$$

The assumption that $A_x$ is invertible means that the $n$ by $n$ matrix

$$\begin{bmatrix} D_1 f_1 & \cdots & D_n f_1 \\ \vdots & & \vdots \\ D_1 f_n & \cdots & D_n f_n \end{bmatrix}$$

evaluated at $(\mathbf{a}, \mathbf{b})$ defines an invertible linear operator in $R^n$; in other words,
its column vectors should be independent, or, equivalently, its determinant
should be $\ne 0$. (See Theorem 9.36.) If, furthermore, (59) holds when $\mathbf{x} = \mathbf{a}$ and
$\mathbf{y} = \mathbf{b}$, then the conclusion of the theorem is that (59) can be solved for $x_1, \dots, x_n$
in terms of $y_1, \dots, y_m$, for every $\mathbf{y}$ near $\mathbf{b}$, and that these solutions are
continuously differentiable functions of $\mathbf{y}$.


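Proof Define $\mathbf{F}$ by

$$\mathbf{F}(\mathbf{x}, \mathbf{y}) = (\mathbf{f}(\mathbf{x}, \mathbf{y}), \mathbf{y}) \qquad ((\mathbf{x}, \mathbf{y}) \in E). \tag{60}$$

Then $\mathbf{F}$ is a $\mathscr{C}'$-mapping of $E$ into $R^{n+m}$. We claim that $\mathbf{F}'(\mathbf{a}, \mathbf{b})$ is an
invertible element of $L(R^{n+m})$:

Since $\mathbf{f}(\mathbf{a}, \mathbf{b}) = \mathbf{0}$, we have

$$\mathbf{f}(\mathbf{a} + \mathbf{h}, \mathbf{b} + \mathbf{k}) = A(\mathbf{h}, \mathbf{k}) + \mathbf{r}(\mathbf{h}, \mathbf{k}),$$

where $\mathbf{r}$ is the remainder that occurs in the definition of $\mathbf{f}'(\mathbf{a}, \mathbf{b})$. Since

$$\mathbf{F}(\mathbf{a} + \mathbf{h}, \mathbf{b} + \mathbf{k}) - \mathbf{F}(\mathbf{a}, \mathbf{b}) = (\mathbf{f}(\mathbf{a} + \mathbf{h}, \mathbf{b} + \mathbf{k}), \mathbf{k}) = (A(\mathbf{h}, \mathbf{k}), \mathbf{k}) + (\mathbf{r}(\mathbf{h}, \mathbf{k}), \mathbf{0}),$$

it follows that $\mathbf{F}'(\mathbf{a}, \mathbf{b})$ is the linear operator on $R^{n+m}$ that maps $(\mathbf{h}, \mathbf{k})$ to
$(A(\mathbf{h}, \mathbf{k}), \mathbf{k})$. If this image vector is $\mathbf{0}$, then $A(\mathbf{h}, \mathbf{k}) = \mathbf{0}$ and $\mathbf{k} = \mathbf{0}$, hence
$A(\mathbf{h}, \mathbf{0}) = \mathbf{0}$, and Theorem 9.27 implies that $\mathbf{h} = \mathbf{0}$. It follows that $\mathbf{F}'(\mathbf{a}, \mathbf{b})$
is 1-1; hence it is invertible (Theorem 9.5).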

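The inverse function theorem can therefore be applied to $\mathbf{F}$. It shows
that there exist open sets $U$ and $V$ in $R^{n+m}$, with $(\mathbf{a}, \mathbf{b}) \in U$, $(\mathbf{0}, \mathbf{b}) \in V$, such
that $\mathbf{F}$ is a 1-1 mapping of $U$ onto $V$.

We let $W$ be the set of all $\mathbf{y} \in R^m$ such that $(\mathbf{0}, \mathbf{y}) \in V$. Note that
$\mathbf{b} \in W$.

It is clear that $W$ is open since $V$ is open.

If $\mathbf{y} \in W$, then $(\mathbf{0}, \mathbf{y}) = \mathbf{F}(\mathbf{x}, \mathbf{y})$ for some $(\mathbf{x}, \mathbf{y}) \in U$. By (60), $\mathbf{f}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$
for this $\mathbf{x}$.

Suppose, with the same $\mathbf{y}$, that $(\mathbf{x}', \mathbf{y}) \in U$ and $\mathbf{f}(\mathbf{x}', \mathbf{y}) = \mathbf{0}$. Then

$$\mathbf{F}(\mathbf{x}', \mathbf{y}) = (\mathbf{f}(\mathbf{x}', \mathbf{y}), \mathbf{y}) = (\mathbf{f}(\mathbf{x}, \mathbf{y}), \mathbf{y}) = \mathbf{F}(\mathbf{x}, \mathbf{y}).$$

Since $\mathbf{F}$ is 1-1 in $U$, it follows that $\mathbf{x}' = \mathbf{x}$.

This proves the first part of the theorem.

For the second part, define $\mathbf{g}(\mathbf{y})$, for $\mathbf{y} \in W$, so that $(\mathbf{g}(\mathbf{y}), \mathbf{y}) \in U$ and
(57) holds. Then

$$\mathbf{F}(\mathbf{g}(\mathbf{y}), \mathbf{y}) = (\mathbf{0}, \mathbf{y}) \qquad (\mathbf{y} \in W). \tag{61}$$

If $\mathbf{G}$ is the mapping of $V$ onto $U$ that inverts $\mathbf{F}$, then $\mathbf{G} \in \mathscr{C}'$, by the inverse
function theorem, and (61) gives

$$(\mathbf{g}(\mathbf{y}), \mathbf{y}) = \mathbf{G}(\mathbf{0}, \mathbf{y}) \qquad (\mathbf{y} \in W). \tag{62}$$

Since $\mathbf{G} \in \mathscr{C}'$, (62) shows that $\mathbf{g} \in \mathscr{C}'$.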

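Finally, to compute $\mathbf{g}'(\mathbf{b})$, put $(\mathbf{g}(\mathbf{y}), \mathbf{y}) = \Phi(\mathbf{y})$. Then

$$\Phi'(\mathbf{y})\mathbf{k} = (\mathbf{g}'(\mathbf{y})\mathbf{k}, \mathbf{k}) \qquad (\mathbf{y} \in W,\ \mathbf{k} \in R^m). \tag{63}$$

By (57), $\mathbf{f}(\Phi(\mathbf{y})) = \mathbf{0}$ in $W$. The chain rule shows therefore that

$$\mathbf{f}'(\Phi(\mathbf{y}))\Phi'(\mathbf{y}) = 0.$$

When $\mathbf{y} = \mathbf{b}$, then $\Phi(\mathbf{y}) = (\mathbf{a}, \mathbf{b})$, and $\mathbf{f}'(\Phi(\mathbf{y})) = A$. Thus

$$A\Phi'(\mathbf{b}) = 0. \tag{64}$$

It now follows from (64), (63), and (54), that

$$A_x \mathbf{g}'(\mathbf{b})\mathbf{k} + A_y \mathbf{k} = A(\mathbf{g}'(\mathbf{b})\mathbf{k}, \mathbf{k}) = A\Phi'(\mathbf{b})\mathbf{k} = \mathbf{0}$$

for every $\mathbf{k} \in R^m$. Thus

$$A_x \mathbf{g}'(\mathbf{b}) + A_y = 0. \tag{65}$$

This is equivalent to (58), and completes the proof.


Note. In terms of the components of $\mathbf{f}$ and $\mathbf{g}$, (65) becomes

$$\sum_{j=1}^{n} (D_j f_i)(\mathbf{a}, \mathbf{b})\,(D_k g_j)(\mathbf{b}) = -(D_{n+k} f_i)(\mathbf{a}, \mathbf{b})$$

or

$$\sum_{j=1}^{n} \left( \frac{\partial f_i}{\partial x_j} \right) \left( \frac{\partial g_j}{\partial y_k} \right) = -\left( \frac{\partial f_i}{\partial y_k} \right)$$

where $1 \le i \le n$, $1 \le k \le m$.

For each $k$, this is a system of $n$ linear equations in which the derivatives
$\partial g_j/\partial y_k$ $(1 \le j \le n)$ are the unknowns.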


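9.29 Example Take $n = 2$, $m = 3$, and consider the mapping $\mathbf{f} = (f_1, f_2)$ of
$R^5$ into $R^2$ given by

$$f_1(x_1, x_2, y_1, y_2, y_3) = 2e^{x_1} + x_2 y_1 - 4y_2 + 3$$

$$f_2(x_1, x_2, y_1, y_2, y_3) = x_2 \cos x_1 - 6x_1 + 2y_1 - y_3.$$

If $\mathbf{a} = (0, 1)$ and $\mathbf{b} = (3, 2, 7)$, then $\mathbf{f}(\mathbf{a}, \mathbf{b}) = \mathbf{0}$.

With respect to the standard bases, the matrix of the transformation
$A = \mathbf{f}'(\mathbf{a}, \mathbf{b})$ is

$$[A] = \begin{bmatrix} 2 & 3 & 1 & -4 & 0 \\ -6 & 1 & 2 & 0 & -1 \end{bmatrix}.$$

Hence

$$[A_x] = \begin{bmatrix} 2 & 3 \\ -6 & 1 \end{bmatrix}, \qquad [A_y] = \begin{bmatrix} 1 & -4 & 0 \\ 2 & 0 & -1 \end{bmatrix}.$$

We see that the column vectors of $[A_x]$ are independent. Hence $A_x$ is invertible
and the implicit function theorem asserts the existence of a $\mathscr{C}'$-mapping $\mathbf{g}$, defined
in a neighborhood of $(3, 2, 7)$, such that $\mathbf{g}(3, 2, 7) = (0, 1)$ and $\mathbf{f}(\mathbf{g}(\mathbf{y}), \mathbf{y}) = \mathbf{0}$.
We can use (58) to compute $\mathbf{g}'(3, 2, 7)$: Since

$$[(A_x)^{-1}] = [A_x]^{-1} = \frac{1}{20} \begin{bmatrix} 1 & -3 \\ 6 & 2 \end{bmatrix},$$

(58) gives

$$[\mathbf{g}'(3, 2, 7)] = -\frac{1}{20} \begin{bmatrix} 1 & -3 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} 1 & -4 & 0 \\ 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{4} & \tfrac{1}{5} & -\tfrac{3}{20} \\ -\tfrac{1}{2} & \tfrac{6}{5} & \tfrac{1}{10} \end{bmatrix}.$$

In terms of partial derivatives, the conclusion is that

$$D_1 g_1 = \tfrac{1}{4}, \qquad D_2 g_1 = \tfrac{1}{5}, \qquad D_3 g_1 = -\tfrac{3}{20},$$

$$D_1 g_2 = -\tfrac{1}{2}, \qquad D_2 g_2 = \tfrac{6}{5}, \qquad D_3 g_2 = \tfrac{1}{10}$$

at the point $(3, 2, 7)$.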
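
The arithmetic of this example is easy to reproduce with exact rational numbers. The sketch below assumes only formula (58); the matrices of $A_x$ and $A_y$ are obtained by differentiating $f_1$ and $f_2$ at $(\mathbf{a}, \mathbf{b}) = ((0, 1), (3, 2, 7))$, and the helper names `inverse_2x2` and `matmul` are introduced here for illustration.

```python
from fractions import Fraction as Fr

# Partial derivatives of f_1, f_2 at (a, b) = ((0, 1), (3, 2, 7)):
# columns 1-2 (with respect to x_1, x_2) form A_x,
# columns 3-5 (with respect to y_1, y_2, y_3) form A_y.
A_x = [[Fr(2), Fr(3)], [Fr(-6), Fr(1)]]
A_y = [[Fr(1), Fr(-4), Fr(0)], [Fr(2), Fr(0), Fr(-1)]]

def inverse_2x2(m):
    """Inverse of a 2 by 2 matrix via the adjugate formula."""
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0:
        raise ValueError("A_x is not invertible")
    return [[s / det, -q / det], [-r / det, p / det]]

def matmul(m1, m2):
    """Product of two matrices given as lists of rows."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(len(m2)))
             for j in range(len(m2[0]))] for i in range(len(m1))]

inv = inverse_2x2(A_x)
# Formula (58): g'(b) = -(A_x)^{-1} A_y.
g_prime = [[-entry for entry in row] for row in matmul(inv, A_y)]
```

Row $i$ of `g_prime` lists $D_1 g_i$, $D_2 g_i$, $D_3 g_i$ at $(3, 2, 7)$ as exact fractions.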


THE RANK THEOREM 

Although this theorem is not as important as the inverse function theorem or 
the implicit function theorem, we include it as another interesting illustration 
of the general principle that the local behavior of a continuously differentiable 
mapping F near a point x is similar to that of the linear transformation F'(x). 
Before stating it, we need a few more facts about linear transformations. 

9.30 Definitions Suppose $X$ and $Y$ are vector spaces, and $A \in L(X, Y)$, as in
Definition 9.6. The null space of $A$, $\mathscr{N}(A)$, is the set of all $\mathbf{x} \in X$ at which $A\mathbf{x} = \mathbf{0}$.
It is clear that $\mathscr{N}(A)$ is a vector space in $X$.

Likewise, the range of $A$, $\mathscr{R}(A)$, is a vector space in $Y$.

The rank of $A$ is defined to be the dimension of $\mathscr{R}(A)$.

For example, the invertible elements of $L(R^n)$ are precisely those whose
rank is $n$. This follows from Theorem 9.5.

If $A \in L(X, Y)$ and $A$ has rank 0, then $A\mathbf{x} = \mathbf{0}$ for all $\mathbf{x} \in X$, hence $\mathscr{N}(A) = X$.
In this connection, see Exercise 25.

9.31 Projections Let $X$ be a vector space. An operator $P \in L(X)$ is said to be
a projection in $X$ if $P^2 = P$.

More explicitly, the requirement is that $P(P\mathbf{x}) = P\mathbf{x}$ for every $\mathbf{x} \in X$. In
other words, $P$ fixes every vector in its range $\mathscr{R}(P)$.

Here are some elementary properties of projections:

(a) If $P$ is a projection in $X$, then every $\mathbf{x} \in X$ has a unique representation
of the form

$$\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2$$

where $\mathbf{x}_1 \in \mathscr{R}(P)$, $\mathbf{x}_2 \in \mathscr{N}(P)$.

To obtain the representation, put $\mathbf{x}_1 = P\mathbf{x}$, $\mathbf{x}_2 = \mathbf{x} - \mathbf{x}_1$. Then
$P\mathbf{x}_2 = P\mathbf{x} - P\mathbf{x}_1 = P\mathbf{x} - P^2\mathbf{x} = \mathbf{0}$. As regards the uniqueness, apply $P$ to
the equation $\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2$. Since $\mathbf{x}_1 \in \mathscr{R}(P)$, $P\mathbf{x}_1 = \mathbf{x}_1$; since $P\mathbf{x}_2 = \mathbf{0}$, it
follows that $\mathbf{x}_1 = P\mathbf{x}$.

(b) If $X$ is a finite-dimensional vector space and if $X_1$ is a vector space in
$X$, then there is a projection $P$ in $X$ with $\mathscr{R}(P) = X_1$.

If $X_1$ contains only $\mathbf{0}$, this is trivial: put $P\mathbf{x} = \mathbf{0}$ for all $\mathbf{x} \in X$.
Assume $\dim X_1 = k > 0$. By Theorem 9.3, $X$ has then a basis
$\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$ such that $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is a basis of $X_1$. Define

$$P(c_1\mathbf{u}_1 + \cdots + c_n\mathbf{u}_n) = c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k$$

for arbitrary scalars $c_1, \dots, c_n$.

Then $P\mathbf{x} = \mathbf{x}$ for every $\mathbf{x} \in X_1$, and $X_1 = \mathscr{R}(P)$.

Note that $\{\mathbf{u}_{k+1}, \dots, \mathbf{u}_n\}$ is a basis of $\mathscr{N}(P)$. Note also that there are
infinitely many projections in $X$, with range $X_1$, if $0 < \dim X_1 < \dim X$.

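9.32 Theorem Suppose $m$, $n$, $r$ are nonnegative integers, $m \ge r$, $n \ge r$, $\mathbf{F}$ is a
$\mathscr{C}'$-mapping of an open set $E \subset R^n$ into $R^m$, and $\mathbf{F}'(\mathbf{x})$ has rank $r$ for every $\mathbf{x} \in E$.
Fix $\mathbf{a} \in E$, put $A = \mathbf{F}'(\mathbf{a})$, let $Y_1$ be the range of $A$, and let $P$ be a projection
in $R^m$ whose range is $Y_1$. Let $Y_2$ be the null space of $P$.

Then there are open sets $U$ and $V$ in $R^n$, with $\mathbf{a} \in U$, $U \subset E$, and there is a
1-1 $\mathscr{C}'$-mapping $\mathbf{H}$ of $V$ onto $U$ (whose inverse is also of class $\mathscr{C}'$) such that

$$\mathbf{F}(\mathbf{H}(\mathbf{x})) = A\mathbf{x} + \varphi(A\mathbf{x}) \qquad (\mathbf{x} \in V) \tag{66}$$

where $\varphi$ is a $\mathscr{C}'$-mapping of the open set $A(V) \subset Y_1$ into $Y_2$.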

After the proof we shall give a more geometric description of the informa- 
tion that (66) contains. 

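Proof If $r = 0$, Theorem 9.19 shows that $\mathbf{F}(\mathbf{x})$ is constant in a neighborhood
$U$ of $\mathbf{a}$, and (66) holds trivially, with $V = U$, $\mathbf{H}(\mathbf{x}) = \mathbf{x}$, $\varphi(\mathbf{0}) = \mathbf{F}(\mathbf{a})$.

From now on we assume $r > 0$. Since $\dim Y_1 = r$, $Y_1$ has a basis
$\{\mathbf{y}_1, \dots, \mathbf{y}_r\}$. Choose $\mathbf{z}_i \in R^n$ so that $A\mathbf{z}_i = \mathbf{y}_i$ $(1 \le i \le r)$, and define a linear
mapping $S$ of $Y_1$ into $R^n$ by setting

$$S(c_1\mathbf{y}_1 + \cdots + c_r\mathbf{y}_r) = c_1\mathbf{z}_1 + \cdots + c_r\mathbf{z}_r \tag{67}$$

for all scalars $c_1, \dots, c_r$.

Then $AS\mathbf{y}_i = A\mathbf{z}_i = \mathbf{y}_i$ for $1 \le i \le r$. Thus

$$AS\mathbf{y} = \mathbf{y} \qquad (\mathbf{y} \in Y_1). \tag{68}$$

Define a mapping $\mathbf{G}$ of $E$ into $R^n$ by setting

$$\mathbf{G}(\mathbf{x}) = \mathbf{x} + SP[\mathbf{F}(\mathbf{x}) - A\mathbf{x}] \qquad (\mathbf{x} \in E). \tag{69}$$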

Since $\mathbf{F}'(\mathbf{a}) = A$, differentiation of (69) shows that $\mathbf{G}'(\mathbf{a}) = I$, the identity
operator on $R^n$. By the inverse function theorem, there are open sets $U$
and $V$ in $R^n$, with $\mathbf{a} \in U$, such that $\mathbf{G}$ is a 1-1 mapping of $U$ onto $V$ whose
inverse $\mathbf{H}$ is also of class $\mathscr{C}'$. Moreover, by shrinking $U$ and $V$, if necessary,
we can arrange it so that $V$ is convex and $\mathbf{H}'(\mathbf{x})$ is invertible for every $\mathbf{x} \in V$.

Note that $ASPA = A$, since $PA = A$ and (68) holds. Therefore (69)
gives

$$A\mathbf{G}(\mathbf{x}) = P\mathbf{F}(\mathbf{x}) \qquad (\mathbf{x} \in E). \tag{70}$$

In particular, (70) holds for $\mathbf{x} \in U$. If we replace $\mathbf{x}$ by $\mathbf{H}(\mathbf{x})$, we obtain

$$P\mathbf{F}(\mathbf{H}(\mathbf{x})) = A\mathbf{x} \qquad (\mathbf{x} \in V). \tag{71}$$

Define

$$\psi(\mathbf{x}) = \mathbf{F}(\mathbf{H}(\mathbf{x})) - A\mathbf{x} \qquad (\mathbf{x} \in V). \tag{72}$$

Since $PA = A$, (71) implies that $P\psi(\mathbf{x}) = \mathbf{0}$ for all $\mathbf{x} \in V$. Thus $\psi$ is a
$\mathscr{C}'$-mapping of $V$ into $Y_2$.

Since $V$ is open, it is clear that $A(V)$ is an open subset of its range
$\mathscr{R}(A) = Y_1$.

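To complete the proof, i.e., to go from (72) to (66), we have to show
that there is a $\mathscr{C}'$-mapping $\varphi$ of $A(V)$ into $Y_2$ which satisfies

$$\varphi(A\mathbf{x}) = \psi(\mathbf{x}) \qquad (\mathbf{x} \in V). \tag{73}$$

As a step toward (73), we will first prove that

$$\psi(\mathbf{x}_1) = \psi(\mathbf{x}_2) \tag{74}$$

if $\mathbf{x}_1 \in V$, $\mathbf{x}_2 \in V$, $A\mathbf{x}_1 = A\mathbf{x}_2$.

Put $\Phi(\mathbf{x}) = \mathbf{F}(\mathbf{H}(\mathbf{x}))$, for $\mathbf{x} \in V$. Since $\mathbf{H}'(\mathbf{x})$ has rank $n$ for every
$\mathbf{x} \in V$, and $\mathbf{F}'(\mathbf{x})$ has rank $r$ for every $\mathbf{x} \in U$, it follows that

$$\operatorname{rank} \Phi'(\mathbf{x}) = \operatorname{rank} \mathbf{F}'(\mathbf{H}(\mathbf{x}))\mathbf{H}'(\mathbf{x}) = r \qquad (\mathbf{x} \in V). \tag{75}$$

Fix $\mathbf{x} \in V$. Let $M$ be the range of $\Phi'(\mathbf{x})$. Then $M \subset R^m$, $\dim M = r$.
By (71),

$$P\Phi'(\mathbf{x}) = A. \tag{76}$$

Thus $P$ maps $M$ onto $\mathscr{R}(A) = Y_1$. Since $M$ and $Y_1$ have the same
dimension, it follows that $P$ (restricted to $M$) is 1-1.

Suppose now that $A\mathbf{h} = \mathbf{0}$. Then $P\Phi'(\mathbf{x})\mathbf{h} = \mathbf{0}$, by (76). But
$\Phi'(\mathbf{x})\mathbf{h} \in M$, and $P$ is 1-1 on $M$. Hence $\Phi'(\mathbf{x})\mathbf{h} = \mathbf{0}$. A look at (72) shows
now that we have proved the following:

If $\mathbf{x} \in V$ and $A\mathbf{h} = \mathbf{0}$, then $\psi'(\mathbf{x})\mathbf{h} = \mathbf{0}$.

We can now prove (74). Suppose $\mathbf{x}_1 \in V$, $\mathbf{x}_2 \in V$, $A\mathbf{x}_1 = A\mathbf{x}_2$. Put
$\mathbf{h} = \mathbf{x}_2 - \mathbf{x}_1$ and define

$$\mathbf{g}(t) = \psi(\mathbf{x}_1 + t\mathbf{h}) \qquad (0 \le t \le 1). \tag{77}$$

The convexity of $V$ shows that $\mathbf{x}_1 + t\mathbf{h} \in V$ for these $t$. Hence

$$\mathbf{g}'(t) = \psi'(\mathbf{x}_1 + t\mathbf{h})\mathbf{h} = \mathbf{0} \qquad (0 \le t \le 1), \tag{78}$$

so that $\mathbf{g}(1) = \mathbf{g}(0)$. But $\mathbf{g}(1) = \psi(\mathbf{x}_2)$ and $\mathbf{g}(0) = \psi(\mathbf{x}_1)$. This proves (74).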

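By (74), $\psi(\mathbf{x})$ depends only on $A\mathbf{x}$, for $\mathbf{x} \in V$. Hence (73) defines $\varphi$
unambiguously in $A(V)$. It only remains to be proved that $\varphi \in \mathscr{C}'$.

Fix $\mathbf{y}_0 \in A(V)$, fix $\mathbf{x}_0 \in V$ so that $A\mathbf{x}_0 = \mathbf{y}_0$. Since $V$ is open, $\mathbf{y}_0$ has
a neighborhood $W$ in $Y_1$ such that the vector

$$\mathbf{x} = \mathbf{x}_0 + S(\mathbf{y} - \mathbf{y}_0) \tag{79}$$

lies in $V$ for all $\mathbf{y} \in W$. By (68),

$$A\mathbf{x} = A\mathbf{x}_0 + \mathbf{y} - \mathbf{y}_0 = \mathbf{y}.$$

Thus (73) and (79) give

$$\varphi(\mathbf{y}) = \psi(\mathbf{x}_0 - S\mathbf{y}_0 + S\mathbf{y}) \qquad (\mathbf{y} \in W). \tag{80}$$

This formula shows that $\varphi \in \mathscr{C}'$ in $W$, hence in $A(V)$, since $\mathbf{y}_0$ was chosen
arbitrarily in $A(V)$.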

The proof is now complete. 

Here is what the theorem tells us about the geometry of the mapping F. 
If $\mathbf{y} \in \mathbf{F}(U)$ then $\mathbf{y} = \mathbf{F}(\mathbf{H}(\mathbf{x}))$ for some $\mathbf{x} \in V$, and (66) shows that $P\mathbf{y} = A\mathbf{x}$.
Therefore

$$\mathbf{y} = P\mathbf{y} + \varphi(P\mathbf{y}) \qquad (\mathbf{y} \in \mathbf{F}(U)). \tag{81}$$

This shows that $\mathbf{y}$ is determined by its projection $P\mathbf{y}$, and that $P$, restricted
to $\mathbf{F}(U)$, is a 1-1 mapping of $\mathbf{F}(U)$ onto $A(V)$. Thus $\mathbf{F}(U)$ is an "$r$-dimensional
surface" with precisely one point "over" each point of $A(V)$. We may also
regard $\mathbf{F}(U)$ as the graph of $\varphi$.

If $\Phi(\mathbf{x}) = \mathbf{F}(\mathbf{H}(\mathbf{x}))$, as in the proof, then (66) shows that the level sets of $\Phi$
(these are the sets on which $\Phi$ attains a given value) are precisely the level sets of
$A$ in $V$. These are "flat" since they are intersections with $V$ of translates of the
vector space $\mathscr{N}(A)$. Note that $\dim \mathscr{N}(A) = n - r$ (Exercise 25).

The level sets of $\mathbf{F}$ in $U$ are the images under $\mathbf{H}$ of the flat level sets of
$\Phi$ in $V$. They are thus "$(n - r)$-dimensional surfaces" in $U$.


DETERMINANTS 

Determinants are numbers associated to square matrices, and hence to the 
operators represented by such matrices. They are 0 if and only if the corre- 
sponding operator fails to be invertible. They can therefore be used to decide 
whether the hypotheses of some of the preceding theorems are satisfied. They 
will play an even more important role in Chap. 10. 





9.33 Definition If $(j_1, \dots, j_n)$ is an ordered $n$-tuple of integers, define

$$s(j_1, \dots, j_n) = \prod_{p < q} \operatorname{sgn}(j_q - j_p), \tag{82}$$

where $\operatorname{sgn} x = 1$ if $x > 0$, $\operatorname{sgn} x = -1$ if $x < 0$, $\operatorname{sgn} x = 0$ if $x = 0$. Then
$s(j_1, \dots, j_n) = 1$, $-1$, or $0$, and it changes sign if any two of the $j$'s are
interchanged.

Let $[A]$ be the matrix of a linear operator $A$ on $R^n$, relative to the standard
basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, with entries $a(i, j)$ in the $i$th row and $j$th column. The
determinant of $[A]$ is defined to be the number

$$\det [A] = \sum s(j_1, \dots, j_n)\,a(1, j_1)\,a(2, j_2) \cdots a(n, j_n). \tag{83}$$

The sum in (83) extends over all ordered $n$-tuples of integers $(j_1, \dots, j_n)$ with
$1 \le j_r \le n$.

The column vectors $\mathbf{x}_j$ of $[A]$ are

$$\mathbf{x}_j = \sum_{i=1}^{n} a(i, j)\,\mathbf{e}_i \qquad (1 \le j \le n). \tag{84}$$

It will be convenient to think of $\det [A]$ as a function of the column vectors
of $[A]$. If we write

$$\det (\mathbf{x}_1, \dots, \mathbf{x}_n) = \det [A],$$

det is now a real function on the set of all ordered $n$-tuples of vectors in $R^n$.
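
Definition (83) can be evaluated literally, by summing over all $n^n$ ordered $n$-tuples and letting $s$ discard those with a repeated index. The sketch below does exactly that; it is meant only to make the definition concrete (the cost grows like $n^n$, so it is useless for large matrices), and the function names and sample matrices are illustrative choices. Indices run from 0 rather than 1, which does not affect $s$, since only the differences $j_q - j_p$ enter.

```python
from itertools import product

def sgn(x):
    """sgn x = 1, -1, or 0 according as x > 0, x < 0, or x = 0."""
    return (x > 0) - (x < 0)

def s(js):
    """s(j_1, ..., j_n) = product over p < q of sgn(j_q - j_p), as in (82)."""
    result = 1
    for q in range(len(js)):
        for p in range(q):
            result *= sgn(js[q] - js[p])
    return result

def det(a):
    """det [A] via (83): sum over all ordered n-tuples (j_1, ..., j_n)."""
    n = len(a)
    total = 0
    for js in product(range(n), repeat=n):
        sign = s(js)
        if sign == 0:
            continue  # a repeated index contributes nothing
        term = sign
        for i, j in enumerate(js):
            term *= a[i][j]  # one factor a(i, j_i) from each row
        total += term
    return total

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
m = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
```

Here `det(identity)` is 1 and `det(m)` is 39; interchanging the first two columns of `m` changes the sign of the result.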

9.34 Theorem 

(a) If $I$ is the identity operator on $R^n$, then

$$\det [I] = \det (\mathbf{e}_1, \dots, \mathbf{e}_n) = 1.$$

(b) det is a linear function of each of the column vectors $\mathbf{x}_j$, if the others are
held fixed.

(c) If $[A]_1$ is obtained from $[A]$ by interchanging two columns, then
$\det [A]_1 = -\det [A]$.

(d) If $[A]$ has two equal columns, then $\det [A] = 0$.

Proof If $A = I$, then $a(i, i) = 1$ and $a(i, j) = 0$ for $i \ne j$. Hence

$$\det [I] = s(1, 2, \dots, n) = 1,$$

which proves (a). By (82), $s(j_1, \dots, j_n) = 0$ if any two of the $j$'s are equal.
Each of the remaining $n!$ products in (83) contains exactly one factor
from each column. This proves (b). Part (c) is an immediate consequence
of the fact that $s(j_1, \dots, j_n)$ changes sign if any two of the $j$'s are
interchanged, and (d) is a corollary of (c).

9.35 Theorem If $[A]$ and $[B]$ are $n$ by $n$ matrices, then

$$\det ([B][A]) = \det [B] \det [A].$$

Proof If $\mathbf{x}_1, \dots, \mathbf{x}_n$ are the columns of $[A]$, define

$$\Delta_B(\mathbf{x}_1, \dots, \mathbf{x}_n) = \Delta_B[A] = \det ([B][A]). \tag{85}$$

The columns of $[B][A]$ are the vectors $B\mathbf{x}_1, \dots, B\mathbf{x}_n$. Thus

$$\Delta_B(\mathbf{x}_1, \dots, \mathbf{x}_n) = \det (B\mathbf{x}_1, \dots, B\mathbf{x}_n). \tag{86}$$

By (86) and Theorem 9.34, $\Delta_B$ also has properties 9.34 (b) to (d). By (b)
and (84),

$$\Delta_B[A] = \Delta_B\Bigl(\sum_i a(i, 1)\,\mathbf{e}_i, \mathbf{x}_2, \dots, \mathbf{x}_n\Bigr) = \sum_i a(i, 1)\,\Delta_B(\mathbf{e}_i, \mathbf{x}_2, \dots, \mathbf{x}_n).$$

Repeating this process with $\mathbf{x}_2, \dots, \mathbf{x}_n$, we obtain

$$\Delta_B[A] = \sum a(i_1, 1)\,a(i_2, 2) \cdots a(i_n, n)\,\Delta_B(\mathbf{e}_{i_1}, \dots, \mathbf{e}_{i_n}), \tag{87}$$

the sum being extended over all ordered $n$-tuples $(i_1, \dots, i_n)$ with
$1 \le i_r \le n$. By (c) and (d),

$$\Delta_B(\mathbf{e}_{i_1}, \dots, \mathbf{e}_{i_n}) = t(i_1, \dots, i_n)\,\Delta_B(\mathbf{e}_1, \dots, \mathbf{e}_n), \tag{88}$$

where $t = 1$, $0$, or $-1$, and since $[B][I] = [B]$, (85) shows that

$$\Delta_B(\mathbf{e}_1, \dots, \mathbf{e}_n) = \det [B]. \tag{89}$$

Substituting (89) and (88) into (87), we obtain

$$\det ([B][A]) = \Bigl\{ \sum a(i_1, 1) \cdots a(i_n, n)\,t(i_1, \dots, i_n) \Bigr\} \det [B],$$

for all $n$ by $n$ matrices $[A]$ and $[B]$. Taking $B = I$, we see that the above
sum in braces is $\det [A]$. This proves the theorem.

9.36 Theorem A linear operator $A$ on $R^n$ is invertible if and only if $\det [A] \ne 0$.

Proof If $A$ is invertible, Theorem 9.35 shows that

$$\det [A] \det [A^{-1}] = \det [AA^{-1}] = \det [I] = 1,$$

so that $\det [A] \ne 0$.

If $A$ is not invertible, the columns $\mathbf{x}_1, \dots, \mathbf{x}_n$ of $[A]$ are dependent
(Theorem 9.5); hence there is one, say, $\mathbf{x}_k$, such that

$$\mathbf{x}_k + \sum_{j \ne k} c_j \mathbf{x}_j = \mathbf{0} \tag{90}$$

for certain scalars $c_j$. By 9.34 (b) and (d), $\mathbf{x}_k$ can be replaced by $\mathbf{x}_k + c_j\mathbf{x}_j$
without altering the determinant, if $j \ne k$. Repeating, we see that $\mathbf{x}_k$ can
be replaced by the left side of (90), i.e., by $\mathbf{0}$, without altering the
determinant. But a matrix which has $\mathbf{0}$ for one column has determinant 0.
Hence $\det [A] = 0$.

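9.37 Remark Suppose $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ and $\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$ are bases in $R^n$.
Every linear operator $A$ on $R^n$ determines matrices $[A]$ and $[A]_U$, with entries
$a_{ij}$ and $\alpha_{ij}$, given by

$$A\mathbf{e}_j = \sum_i a_{ij}\,\mathbf{e}_i, \qquad A\mathbf{u}_j = \sum_i \alpha_{ij}\,\mathbf{u}_i.$$

If $\mathbf{u}_j = B\mathbf{e}_j = \sum_i b_{ij}\,\mathbf{e}_i$, then $A\mathbf{u}_j$ is equal to

$$\sum_k \alpha_{kj}\,B\mathbf{e}_k = \sum_k \alpha_{kj} \sum_i b_{ik}\,\mathbf{e}_i = \sum_i \Bigl( \sum_k b_{ik}\alpha_{kj} \Bigr) \mathbf{e}_i,$$

and also to

$$AB\mathbf{e}_j = A \sum_k b_{kj}\,\mathbf{e}_k = \sum_i \Bigl( \sum_k a_{ik} b_{kj} \Bigr) \mathbf{e}_i.$$

Thus $\sum_k b_{ik}\alpha_{kj} = \sum_k a_{ik} b_{kj}$, or

$$[B][A]_U = [A][B]. \tag{91}$$

Since $B$ is invertible, $\det [B] \ne 0$. Hence (91), combined with Theorem 9.35,
shows that

$$\det [A]_U = \det [A]. \tag{92}$$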

The determinant of the matrix of a linear operator does therefore not 
depend on the basis which is used to construct the matrix. It is thus meaningful 
to speak of the determinant of a linear operator , without having any basis in mind. 


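9.38 Jacobians If $\mathbf{f}$ maps an open set $E \subset R^n$ into $R^n$, and if $\mathbf{f}$ is
differentiable at a point $\mathbf{x} \in E$, the determinant of the linear operator $\mathbf{f}'(\mathbf{x})$ is called
the Jacobian of $\mathbf{f}$ at $\mathbf{x}$. In symbols,

$$J_{\mathbf{f}}(\mathbf{x}) = \det \mathbf{f}'(\mathbf{x}). \tag{93}$$

We shall also use the notation

$$\frac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)} \tag{94}$$

for $J_{\mathbf{f}}(\mathbf{x})$, if $(y_1, \dots, y_n) = \mathbf{f}(x_1, \dots, x_n)$.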
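
A familiar illustration of (93) and (94), offered here as a worked example rather than as part of the original discussion: for the polar-coordinate mapping of the plane, with the variables named $(r, \theta)$ by assumption, the matrix of $\mathbf{f}'$ and the Jacobian can be written out directly.

```latex
% Polar coordinates: (y_1, y_2) = f(r, \theta) = (r\cos\theta, r\sin\theta).
\[
[\mathbf{f}'(r,\theta)] =
\begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix},
\qquad
J_{\mathbf{f}}(r,\theta) = \frac{\partial(y_1, y_2)}{\partial(r, \theta)}
= r\cos^2\theta + r\sin^2\theta = r .
\]
```

So the Jacobian of this mapping is nonzero exactly at the points where $r \ne 0$.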

In terms of Jacobians, the crucial hypothesis in the inverse function
theorem is that $J_{\mathbf{f}}(\mathbf{a}) \ne 0$ (compare Theorem 9.36). If the implicit function
theorem is stated in terms of the functions (59), the assumption made there on
$A$ amounts to

$$\frac{\partial(f_1, \dots, f_n)}{\partial(x_1, \dots, x_n)} \ne 0.$$


DERIVATIVES OF HIGHER ORDER

9.39 Definition Suppose $f$ is a real function defined in an open set $E \subset R^n$,
with partial derivatives $D_1 f, \dots, D_n f$. If the functions $D_j f$ are themselves
differentiable, then the second-order partial derivatives of $f$ are defined by

$$D_{ij} f = D_i D_j f \qquad (i, j = 1, \dots, n).$$

If all these functions $D_{ij} f$ are continuous in $E$, we say that $f$ is of class $\mathscr{C}''$ in $E$,
or that $f \in \mathscr{C}''(E)$.

A mapping $\mathbf{f}$ of $E$ into $R^m$ is said to be of class $\mathscr{C}''$ if each component of $\mathbf{f}$
is of class $\mathscr{C}''$.

It can happen that $D_{ij} f \ne D_{ji} f$ at some point, although both derivatives
exist (see Exercise 27). However, we shall see below that $D_{ij} f = D_{ji} f$ whenever
these derivatives are continuous.

For simplicity (and without loss of generality) we state our next two 
theorems for real functions of two variables. The first one is a mean value 
theorem. 


9.40 Theorem Suppose $f$ is defined in an open set $E \subset R^2$, and $D_1 f$ and $D_{21} f$
exist at every point of $E$. Suppose $Q \subset E$ is a closed rectangle with sides parallel
to the coordinate axes, having $(a, b)$ and $(a + h, b + k)$ as opposite vertices
$(h \ne 0, k \ne 0)$. Put

$$\Delta(f, Q) = f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b).$$

Then there is a point $(x, y)$ in the interior of $Q$ such that

$$\Delta(f, Q) = hk\,(D_{21} f)(x, y). \tag{95}$$

Note the analogy between (95) and Theorem 5.10; the area of $Q$ is $hk$.

Proof Put $u(t) = f(t, b + k) - f(t, b)$. Two applications of Theorem 5.10
show that there is an $x$ between $a$ and $a + h$, and that there is a $y$ between
$b$ and $b + k$, such that

$$\begin{aligned} \Delta(f, Q) &= u(a + h) - u(a) \\ &= h\,u'(x) \\ &= h[(D_1 f)(x, b + k) - (D_1 f)(x, b)] \\ &= hk\,(D_{21} f)(x, y). \end{aligned}$$

9.41 Theorem Suppose $f$ is defined in an open set $E \subset R^2$, suppose that $D_1 f$,
$D_{21} f$, and $D_2 f$ exist at every point of $E$, and $D_{21} f$ is continuous at some point
$(a, b) \in E$.

Then $D_{12} f$ exists at $(a, b)$ and

(96)    $(D_{12} f)(a, b) = (D_{21} f)(a, b).$

Corollary $D_{21} f = D_{12} f$ if $f \in \mathscr{C}''(E)$.

Proof Put $A = (D_{21} f)(a, b)$. Choose $\varepsilon > 0$. If $Q$ is a rectangle as in
Theorem 9.40, and if $h$ and $k$ are sufficiently small, we have

        $|A - (D_{21} f)(x, y)| < \varepsilon$

for all $(x, y) \in Q$. Thus

        $\left| \dfrac{\Delta(f, Q)}{hk} - A \right| < \varepsilon,$

by (95). Fix $h$, and let $k \to 0$. Since $D_2 f$ exists in $E$, the last inequality
implies that

(97)    $\left| \dfrac{(D_2 f)(a + h, b) - (D_2 f)(a, b)}{h} - A \right| \le \varepsilon.$

Since $\varepsilon$ was arbitrary, and since (97) holds for all sufficiently small
$h \neq 0$, it follows that $(D_{12} f)(a, b) = A$. This gives (96).
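
As a numerical illustration (not part of the proof), the quotient $\Delta(f, Q)/(hk)$ of Theorem 9.40 can be evaluated directly. The sketch below is an assumption-laden aside: the names `second_difference_quotient`, `smooth`, and `exercise27` are hypothetical, and the smooth test function is chosen only for convenience. For a function of class $\mathscr{C}''$ the quotient approaches the common value of the mixed derivatives; for the function of Exercise 27 it approaches $1$ or $-1$ at the origin, depending on which of $h$, $k$ tends to 0 first.

```python
import math

def second_difference_quotient(f, a, b, h, k):
    """Return Delta(f, Q) / (h*k) for the rectangle with opposite
    vertices (a, b) and (a+h, b+k), as in Theorem 9.40."""
    delta = f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b)
    return delta / (h * k)

def smooth(x, y):
    # Hypothetical test function of class C''.
    return x**2 * y**3 + math.sin(x * y)

def smooth_d21(x, y):
    # D_21 smooth = D_2 D_1 smooth, computed by hand.
    return 6 * x * y**2 + math.cos(x * y) - x * y * math.sin(x * y)

def exercise27(x, y):
    # The function of Exercise 27, with the value 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

a, b = 0.5, 1.5
approx = second_difference_quotient(smooth, a, b, 1e-4, 1e-4)
exact = smooth_d21(a, b)

# k much smaller than h imitates "fix h, let k -> 0"; the other call reverses the roles.
k_first = second_difference_quotient(exercise27, 0.0, 0.0, 1e-3, 1e-9)
h_first = second_difference_quotient(exercise27, 0.0, 0.0, 1e-9, 1e-3)

print(approx, exact)
print(k_first, h_first)
```

The first pair of numbers should agree to a few decimal places; the second pair should be close to $1$ and $-1$, which is consistent with Exercise 27(c) and shows that the continuity hypothesis in Theorem 9.41 cannot simply be dropped.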


DIFFERENTIATION OF INTEGRALS 

Suppose $\varphi$ is a function of two variables which can be integrated with respect
to one and which can be differentiated with respect to the other. Under what
conditions will the result be the same if these two limit processes are carried out
in the opposite order? To state the question more precisely: Under what
conditions on $\varphi$ can one prove that the equation

(98)    $\dfrac{d}{dt} \int_a^b \varphi(x, t)\,dx = \int_a^b \dfrac{\partial \varphi}{\partial t}(x, t)\,dx$

is true? (A counterexample is furnished by Exercise 28.)

It will be convenient to use the notation

(99)    $\varphi^t(x) = \varphi(x, t).$

Thus $\varphi^t$ is, for each $t$, a function of one variable.

9.42 Theorem Suppose 

(a) $\varphi(x, t)$ is defined for $a \le x \le b$, $c \le t \le d$;

(b) $\alpha$ is an increasing function on $[a, b]$;

(c) $\varphi^t \in \mathscr{R}(\alpha)$ for every $t \in [c, d]$;

(d) $c < s < d$, and to every $\varepsilon > 0$ corresponds a $\delta > 0$ such that

        $|(D_2 \varphi)(x, t) - (D_2 \varphi)(x, s)| < \varepsilon$

for all $x \in [a, b]$ and for all $t \in (s - \delta, s + \delta)$.

Define

(100)   $f(t) = \int_a^b \varphi(x, t)\,d\alpha(x) \qquad (c \le t \le d).$

Then $(D_2 \varphi)^s \in \mathscr{R}(\alpha)$, $f'(s)$ exists, and

(101)   $f'(s) = \int_a^b (D_2 \varphi)(x, s)\,d\alpha(x).$

Note that (c) simply asserts the existence of the integrals (100) for all
$t \in [c, d]$. Note also that (d) certainly holds whenever $D_2 \varphi$ is continuous on the
rectangle on which $\varphi$ is defined.

Proof Consider the difference quotients

        $\psi(x, t) = \dfrac{\varphi(x, t) - \varphi(x, s)}{t - s}$

for $0 < |t - s| < \delta$. By Theorem 5.10 there corresponds to each $(x, t)$ a
number $u$ between $s$ and $t$ such that

        $\psi(x, t) = (D_2 \varphi)(x, u).$

Hence (d) implies that

(102)   $|\psi(x, t) - (D_2 \varphi)(x, s)| < \varepsilon \qquad (a \le x \le b,\ 0 < |t - s| < \delta).$

Note that

(103)   $\dfrac{f(t) - f(s)}{t - s} = \int_a^b \psi(x, t)\,d\alpha(x).$

By (102), $\psi^t \to (D_2 \varphi)^s$, uniformly on $[a, b]$, as $t \to s$. Since each
$\psi^t \in \mathscr{R}(\alpha)$, the desired conclusion follows from (103) and Theorem 7.16.


9.43 Example One can of course prove analogues of Theorem 9.42 with
$(-\infty, \infty)$ in place of $[a, b]$. Instead of doing this, let us simply look at an
example. Define

(104)   $f(t) = \int_{-\infty}^{\infty} e^{-x^2} \cos(xt)\,dx$

and

(105)   $g(t) = -\int_{-\infty}^{\infty} x e^{-x^2} \sin(xt)\,dx,$

for $-\infty < t < \infty$. Both integrals exist (they converge absolutely) since the
absolute values of the integrands are at most $\exp(-x^2)$ and $|x| \exp(-x^2)$,
respectively.

Note that $g$ is obtained from $f$ by differentiating the integrand with respect
to $t$. We claim that $f$ is differentiable and that

(106)   $f'(t) = g(t) \qquad (-\infty < t < \infty).$

To prove this, let us first examine the difference quotients of the cosine:
if $\beta > 0$, then

(107)   $\dfrac{\cos(\alpha + \beta) - \cos \alpha}{\beta} + \sin \alpha = \dfrac{1}{\beta} \int_{\alpha}^{\alpha + \beta} (\sin \alpha - \sin t)\,dt.$

Since $|\sin \alpha - \sin t| \le |t - \alpha|$, the right side of (107) is at most $\beta/2$ in absolute
value; the case $\beta < 0$ is handled similarly. Thus

(108)   $\left| \dfrac{\cos(\alpha + \beta) - \cos \alpha}{\beta} + \sin \alpha \right| \le |\beta|$

for all $\beta$ (if the left side is interpreted to be 0 when $\beta = 0$).

Now fix $t$, and fix $h \neq 0$. Apply (108) with $\alpha = xt$, $\beta = xh$; it follows from
(104) and (105) that

        $\left| \dfrac{f(t + h) - f(t)}{h} - g(t) \right| \le |h| \int_{-\infty}^{\infty} x^2 e^{-x^2}\,dx.$

When $h \to 0$, we thus obtain (106).

Let us go a step further: An integration by parts, applied to (104), shows
that

(109)   $f(t) = 2 \int_{-\infty}^{\infty} x e^{-x^2}\,\dfrac{\sin(xt)}{t}\,dx.$

Thus $t f(t) = -2 g(t)$, and (106) implies now that $f$ satisfies the differential
equation

(110)   $2 f'(t) + t f(t) = 0.$

If we solve this differential equation and use the fact that $f(0) = \sqrt{\pi}$ (see Sec.
8.21), we find that

(111)   $f(t) = \sqrt{\pi}\,\exp\left(-\dfrac{t^2}{4}\right).$

The integral (104) is thus explicitly determined.
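
The closed form (111) can be checked numerically. The following sketch is an added illustration, not part of the text's argument: it truncates the integral (104) to a finite interval and applies the trapezoidal rule. The function names, the truncation width, and the step count are all assumptions chosen for this example.

```python
import math

def gaussian_cosine_integral(t, half_width=10.0, steps=20000):
    """Trapezoidal approximation of the integral (104), truncated to
    [-half_width, half_width]; the neglected tails are of order exp(-100)."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * x) * math.cos(x * t)
    return total * dx

def closed_form(t):
    """The right side of (111)."""
    return math.sqrt(math.pi) * math.exp(-t * t / 4.0)

for t in (0.0, 1.0, 2.5):
    print(t, gaussian_cosine_integral(t), closed_form(t))
```

For each sampled $t$ the two printed values should agree to many decimal places, as (111) predicts.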
EXERCISES

1. If $S$ is a nonempty subset of a vector space $X$, prove (as asserted in Sec. 9.1) that
the span of $S$ is a vector space.

2. Prove (as asserted in Sec. 9.6) that $BA$ is linear if $A$ and $B$ are linear transformations.
Prove also that $A^{-1}$ is linear and invertible.

3. Assume $A \in L(X, Y)$ and $A\mathbf{x} = \mathbf{0}$ only when $\mathbf{x} = \mathbf{0}$. Prove that $A$ is then 1-1.

4. Prove (as asserted in Sec. 9.30) that null spaces and ranges of linear transforma-
tions are vector spaces.

5. Prove that to every $A \in L(R^n, R^1)$ corresponds a unique $\mathbf{y} \in R^n$ such that $A\mathbf{x} = \mathbf{x} \cdot \mathbf{y}$.
Prove also that $\|A\| = |\mathbf{y}|$.

Hint: Under certain conditions, equality holds in the Schwarz inequality.

6. If $f(0, 0) = 0$ and

        $f(x, y) = \dfrac{xy}{x^2 + y^2}$ if $(x, y) \neq (0, 0)$,

prove that $(D_1 f)(x, y)$ and $(D_2 f)(x, y)$ exist at every point of $R^2$, although $f$ is
not continuous at $(0, 0)$.

7. Suppose that $f$ is a real-valued function defined in an open set $E \subset R^n$, and that
the partial derivatives $D_1 f, \dots, D_n f$ are bounded in $E$. Prove that $f$ is continuous
in $E$.

Hint: Proceed as in the proof of Theorem 9.21.

8. Suppose that $f$ is a differentiable real function in an open set $E \subset R^n$, and that $f$
has a local maximum at a point $\mathbf{x} \in E$. Prove that $f'(\mathbf{x}) = 0$.

9. If $\mathbf{f}$ is a differentiable mapping of a connected open set $E \subset R^n$ into $R^m$, and if
$\mathbf{f}'(\mathbf{x}) = 0$ for every $\mathbf{x} \in E$, prove that $\mathbf{f}$ is constant in $E$.

10. If $f$ is a real function defined in a convex open set $E \subset R^n$, such that $(D_1 f)(\mathbf{x}) = 0$
for every $\mathbf{x} \in E$, prove that $f(\mathbf{x})$ depends only on $x_2, \dots, x_n$.

Show that the convexity of $E$ can be replaced by a weaker condition, but
that some condition is required. For example, if $n = 2$ and $E$ is shaped like a
horseshoe, the statement may be false.

11. If $f$ and $g$ are differentiable real functions in $R^n$, prove that

        $\nabla(fg) = f\,\nabla g + g\,\nabla f$

and that $\nabla(1/f) = -f^{-2}\,\nabla f$ wherever $f \neq 0$.

12. Fix two real numbers $a$ and $b$, $0 < a < b$. Define a mapping $\mathbf{f} = (f_1, f_2, f_3)$ of $R^2$
into $R^3$ by

        $f_1(s, t) = (b + a \cos s) \cos t$
        $f_2(s, t) = (b + a \cos s) \sin t$
        $f_3(s, t) = a \sin s.$

Describe the range $K$ of $\mathbf{f}$. (It is a certain compact subset of $R^3$.)

(a) Show that there are exactly 4 points $\mathbf{p} \in K$ such that

        $(\nabla f_1)(\mathbf{f}^{-1}(\mathbf{p})) = \mathbf{0}.$

Find these points.

(b) Determine the set of all $\mathbf{q} \in K$ such that

        $(\nabla f_3)(\mathbf{f}^{-1}(\mathbf{q})) = \mathbf{0}.$

(c) Show that one of the points $\mathbf{p}$ found in part (a) corresponds to a local maxi-
mum of $f_1$, one corresponds to a local minimum, and that the other two are
neither (they are so-called "saddle points").

Which of the points $\mathbf{q}$ found in part (b) correspond to maxima or minima?

(d) Let $\lambda$ be an irrational real number, and define $\mathbf{g}(t) = \mathbf{f}(t, \lambda t)$. Prove that $\mathbf{g}$ is a
1-1 mapping of $R^1$ onto a dense subset of $K$. Prove that

        $|\mathbf{g}'(t)|^2 = a^2 + \lambda^2 (b + a \cos t)^2.$

13. Suppose $\mathbf{f}$ is a differentiable mapping of $R^1$ into $R^3$ such that $|\mathbf{f}(t)| = 1$ for every $t$.
Prove that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$.

Interpret this result geometrically.

14. Define $f(0, 0) = 0$ and

        $f(x, y) = \dfrac{x^3}{x^2 + y^2}$ if $(x, y) \neq (0, 0)$.

(a) Prove that $D_1 f$ and $D_2 f$ are bounded functions in $R^2$. (Hence $f$ is continuous.)

(b) Let $\mathbf{u}$ be any unit vector in $R^2$. Show that the directional derivative $(D_{\mathbf{u}} f)(0, 0)$
exists, and that its absolute value is at most 1.

(c) Let $\gamma$ be a differentiable mapping of $R^1$ into $R^2$ (in other words, $\gamma$ is a differ-
entiable curve in $R^2$), with $\gamma(0) = (0, 0)$ and $|\gamma'(0)| > 0$. Put $g(t) = f(\gamma(t))$ and
prove that $g$ is differentiable for every $t \in R^1$.

If $\gamma \in \mathscr{C}'$, prove that $g \in \mathscr{C}'$.

(d) In spite of this, prove that $f$ is not differentiable at $(0, 0)$.

Hint: Formula (40) fails.

15. Define $f(0, 0) = 0$, and put

        $f(x, y) = x^2 + y^2 - 2x^2 y - \dfrac{4 x^6 y^2}{(x^4 + y^2)^2}$

if $(x, y) \neq (0, 0)$.

(a) Prove, for all $(x, y) \in R^2$, that

        $4 x^4 y^2 \le (x^4 + y^2)^2.$

Conclude that $f$ is continuous.

(b) For $0 \le \theta \le 2\pi$, $-\infty < t < \infty$, define

        $g_\theta(t) = f(t \cos \theta, t \sin \theta).$

Show that $g_\theta(0) = 0$, $g_\theta'(0) = 0$, $g_\theta''(0) = 2$. Each $g_\theta$ has therefore a strict local
minimum at $t = 0$.

In other words, the restriction of $f$ to each line through $(0, 0)$ has a strict
local minimum at $(0, 0)$.

(c) Show that $(0, 0)$ is nevertheless not a local minimum for $f$, since $f(x, x^2) = -x^4$.

16. Show that the continuity of $\mathbf{f}'$ at the point $\mathbf{a}$ is needed in the inverse function
theorem, even in the case $n = 1$: If

        $f(t) = t + 2t^2 \sin\left(\dfrac{1}{t}\right)$

for $t \neq 0$, and $f(0) = 0$, then $f'(0) = 1$, $f'$ is bounded in $(-1, 1)$, but $f$ is not
one-to-one in any neighborhood of 0.

17. Let $\mathbf{f} = (f_1, f_2)$ be the mapping of $R^2$ into $R^2$ given by

        $f_1(x, y) = e^x \cos y, \qquad f_2(x, y) = e^x \sin y.$

(a) What is the range of $\mathbf{f}$?

(b) Show that the Jacobian of $\mathbf{f}$ is not zero at any point of $R^2$. Thus every point
of $R^2$ has a neighborhood in which $\mathbf{f}$ is one-to-one. Nevertheless, $\mathbf{f}$ is not one-to-
one on $R^2$.

(c) Put $\mathbf{a} = (0, \pi/3)$, $\mathbf{b} = \mathbf{f}(\mathbf{a})$, let $\mathbf{g}$ be the continuous inverse of $\mathbf{f}$, defined in a
neighborhood of $\mathbf{b}$, such that $\mathbf{g}(\mathbf{b}) = \mathbf{a}$. Find an explicit formula for $\mathbf{g}$, compute
$\mathbf{f}'(\mathbf{a})$ and $\mathbf{g}'(\mathbf{b})$, and verify the formula (52).

(d) What are the images under $\mathbf{f}$ of lines parallel to the coordinate axes?

18. Answer analogous questions for the mapping defined by

        $u = x^2 - y^2, \qquad v = 2xy.$

19. Show that the system of equations

        $3x + y - z + u^2 = 0$
        $x - y + 2z + u = 0$
        $2x + 2y - 3z + 2u = 0$

can be solved for $x, y, u$ in terms of $z$; for $x, z, u$ in terms of $y$; for $y, z, u$ in terms
of $x$; but not for $x, y, z$ in terms of $u$.

20. Take $n = m = 1$ in the implicit function theorem, and interpret the theorem (as
well as its proof) graphically.

21. Define $f$ in $R^2$ by

        $f(x, y) = 2x^3 - 3x^2 + 2y^3 + 3y^2.$

(a) Find the four points in $R^2$ at which the gradient of $f$ is zero. Show that $f$ has
exactly one local maximum and one local minimum in $R^2$.

(b) Let $S$ be the set of all $(x, y) \in R^2$ at which $f(x, y) = 0$. Find those points of
$S$ that have no neighborhoods in which the equation $f(x, y) = 0$ can be solved for
$y$ in terms of $x$ (or for $x$ in terms of $y$). Describe $S$ as precisely as you can.

22. Give a similar discussion for

        $f(x, y) = 2x^3 + 6xy^2 - 3x^2 + 3y^2.$

23. Define $f$ in $R^3$ by

        $f(x, y_1, y_2) = x^2 y_1 + e^x + y_2.$

Show that $f(0, 1, -1) = 0$, $(D_1 f)(0, 1, -1) \neq 0$, and that there exists therefore a
differentiable function $g$ in some neighborhood of $(1, -1)$ in $R^2$, such that
$g(1, -1) = 0$ and

        $f(g(y_1, y_2), y_1, y_2) = 0.$

Find $(D_1 g)(1, -1)$ and $(D_2 g)(1, -1)$.

24. For $(x, y) \neq (0, 0)$, define $\mathbf{f} = (f_1, f_2)$ by

        $f_1(x, y) = \dfrac{x^2 - y^2}{x^2 + y^2}, \qquad f_2(x, y) = \dfrac{xy}{x^2 + y^2}.$

Compute the rank of $\mathbf{f}'(x, y)$, and find the range of $\mathbf{f}$.

25. Suppose $A \in L(R^n, R^m)$, let $r$ be the rank of $A$.

(a) Define $S$ as in the proof of Theorem 9.32. Show that $SA$ is a projection in $R^n$
whose null space is $\mathscr{N}(A)$ and whose range is $\mathscr{R}(S)$. Hint: By (68), $SASA = SA$.

(b) Use (a) to show that

        $\dim \mathscr{N}(A) + \dim \mathscr{R}(A) = n.$

26. Show that the existence (and even the continuity) of $D_{12} f$ does not imply the
existence of $D_1 f$. For example, let $f(x, y) = g(x)$, where $g$ is nowhere differentiable.

27. Put $f(0, 0) = 0$, and

        $f(x, y) = \dfrac{xy(x^2 - y^2)}{x^2 + y^2}$

if $(x, y) \neq (0, 0)$. Prove that

(a) $f$, $D_1 f$, $D_2 f$ are continuous in $R^2$;

(b) $D_{12} f$ and $D_{21} f$ exist at every point of $R^2$, and are continuous except at $(0, 0)$;

(c) $(D_{12} f)(0, 0) = 1$, and $(D_{21} f)(0, 0) = -1$.

28. For $t \ge 0$, put

        $\varphi(x, t) = \begin{cases} x & (0 \le x \le \sqrt{t}) \\ -x + 2\sqrt{t} & (\sqrt{t} \le x \le 2\sqrt{t}) \\ 0 & (\text{otherwise}), \end{cases}$

and put $\varphi(x, t) = -\varphi(x, |t|)$ if $t < 0$.

Show that $\varphi$ is continuous on $R^2$, and

        $(D_2 \varphi)(x, 0) = 0$

for all $x$. Define

        $f(t) = \int_{-1}^{1} \varphi(x, t)\,dx.$

Show that $f(t) = t$ if $|t| < \tfrac{1}{4}$. Hence

        $f'(0) \neq \int_{-1}^{1} (D_2 \varphi)(x, 0)\,dx.$

29. Let $E$ be an open set in $R^n$. The classes $\mathscr{C}'(E)$ and $\mathscr{C}''(E)$ are defined in the text.
By induction, $\mathscr{C}^{(k)}(E)$ can be defined as follows, for all positive integers $k$: To say
that $f \in \mathscr{C}^{(k)}(E)$ means that the partial derivatives $D_1 f, \dots, D_n f$ belong to $\mathscr{C}^{(k-1)}(E)$.

Assume $f \in \mathscr{C}^{(k)}(E)$, and show (by repeated application of Theorem 9.41)
that the $k$th-order derivative

        $D_{i_1 i_2 \cdots i_k} f = D_{i_1} D_{i_2} \cdots D_{i_k} f$

is unchanged if the subscripts $i_1, \dots, i_k$ are permuted.

For instance, if $n \ge 3$, then

        $D_{1213} f = D_{3112} f$

for every $f \in \mathscr{C}^{(4)}$.

30. Let $f \in \mathscr{C}^{(m)}(E)$, where $E$ is an open subset of $R^n$. Fix $\mathbf{a} \in E$, and suppose $\mathbf{x} \in R^n$
is so close to $\mathbf{0}$ that the points

        $\mathbf{p}(t) = \mathbf{a} + t\mathbf{x}$

lie in $E$ whenever $0 \le t \le 1$. Define

        $h(t) = f(\mathbf{p}(t))$

for all $t \in R^1$ for which $\mathbf{p}(t) \in E$.

(a) For $1 \le k \le m$, show (by repeated application of the chain rule) that

        $h^{(k)}(t) = \sum (D_{i_1 \cdots i_k} f)(\mathbf{p}(t))\, x_{i_1} \cdots x_{i_k}.$

The sum extends over all ordered $k$-tuples $(i_1, \dots, i_k)$ in which each $i_j$ is one of the
integers $1, \dots, n$.

(b) By Taylor's theorem (5.15),

        $h(1) = \sum_{k=0}^{m-1} \dfrac{h^{(k)}(0)}{k!} + \dfrac{h^{(m)}(t)}{m!}$

for some $t \in (0, 1)$. Use this to prove Taylor's theorem in $n$ variables by showing
that the formula

        $f(\mathbf{a} + \mathbf{x}) = \sum_{k=0}^{m-1} \dfrac{1}{k!} \sum (D_{i_1 \cdots i_k} f)(\mathbf{a})\, x_{i_1} \cdots x_{i_k} + r(\mathbf{x})$

represents $f(\mathbf{a} + \mathbf{x})$ as the sum of its so-called "Taylor polynomial of degree
$m - 1$," plus a remainder that satisfies

        $\lim_{\mathbf{x} \to \mathbf{0}} \dfrac{r(\mathbf{x})}{|\mathbf{x}|^{m-1}} = 0.$

Each of the inner sums extends over all ordered $k$-tuples $(i_1, \dots, i_k)$, as in
part (a); as usual, the zero-order derivative of $f$ is simply $f$, so that the constant
term of the Taylor polynomial of $f$ at $\mathbf{a}$ is $f(\mathbf{a})$.

(c) Exercise 29 shows that repetition occurs in the Taylor polynomial as written in
part (b). For instance, $D_{113}$ occurs three times, as $D_{113}$, $D_{131}$, $D_{311}$. The sum of
the corresponding three terms can be written in the form

        $3 (D_1^2 D_3 f)(\mathbf{a})\, x_1^2 x_3.$

Prove (by calculating how often each derivative occurs) that the Taylor polynomial
in (b) can be written in the form

        $\sum \dfrac{(D_1^{s_1} \cdots D_n^{s_n} f)(\mathbf{a})}{s_1! \cdots s_n!}\, x_1^{s_1} \cdots x_n^{s_n}.$

Here the summation extends over all ordered $n$-tuples $(s_1, \dots, s_n)$ such that each
$s_i$ is a nonnegative integer, and $s_1 + \cdots + s_n \le m - 1$.

31. Suppose $f \in \mathscr{C}^{(3)}$ in some neighborhood of a point $\mathbf{a} \in R^2$, the gradient of $f$ is $\mathbf{0}$
at $\mathbf{a}$, but not all second-order derivatives of $f$ are 0 at $\mathbf{a}$. Show how one can then
determine from the Taylor polynomial of $f$ at $\mathbf{a}$ (of degree 2) whether $f$ has a local
maximum, or a local minimum, or neither, at the point $\mathbf{a}$.

Extend this to $R^n$ in place of $R^2$.



10 

INTEGRATION OF DIFFERENTIAL FORMS 


Integration can be studied on many levels. In Chap. 6, the theory was developed
for reasonably well-behaved functions on subintervals of the real line. In
Chap. 11 we shall encounter a very highly developed theory of integration that
can be applied to much larger classes of functions, whose domains are more
or less arbitrary sets, not necessarily subsets of $R^n$. The present chapter is
devoted to those aspects of integration theory that are closely related to the
geometry of euclidean spaces, such as the change of variables formula, line
integrals, and the machinery of differential forms that is used in the statement
and proof of the $n$-dimensional analogue of the fundamental theorem of calculus,
namely Stokes' theorem.


INTEGRATION 

10.1 Definition Suppose $I^k$ is a $k$-cell in $R^k$, consisting of all

        $\mathbf{x} = (x_1, \dots, x_k)$

such that

(1)     $a_i \le x_i \le b_i \qquad (i = 1, \dots, k),$

$I^j$ is the $j$-cell in $R^j$ defined by the first $j$ inequalities (1), and $f$ is a real con-
tinuous function on $I^k$.

Put $f = f_k$, and define $f_{k-1}$ on $I^{k-1}$ by

        $f_{k-1}(x_1, \dots, x_{k-1}) = \int_{a_k}^{b_k} f_k(x_1, \dots, x_{k-1}, x_k)\,dx_k.$

The uniform continuity of $f_k$ on $I^k$ shows that $f_{k-1}$ is continuous on $I^{k-1}$.
Hence we can repeat this process and obtain functions $f_j$, continuous on $I^j$, such
that $f_{j-1}$ is the integral of $f_j$, with respect to $x_j$, over $[a_j, b_j]$. After $k$ steps we
arrive at a number $f_0$, which we call the integral of $f$ over $I^k$; we write it in the
form

(2)     $\int_{I^k} f(\mathbf{x})\,d\mathbf{x}$ or $\int_{I^k} f.$

A priori, this definition of the integral depends on the order in which the 
k integrations are carried out. However, this dependence is only apparent. To 
prove this, let us introduce the temporary notation L(f) for the integral (2) 
and L'(f) for the result obtained by carrying out the k integrations in some 
other order. 


10.2 Theorem For every $f \in \mathscr{C}(I^k)$, $L(f) = L'(f)$.

Proof If $h(\mathbf{x}) = h_1(x_1) \cdots h_k(x_k)$, where $h_j \in \mathscr{C}([a_j, b_j])$, then

        $L(h) = \prod_{i=1}^{k} \int_{a_i}^{b_i} h_i(x_i)\,dx_i = L'(h).$

If $\mathscr{A}$ is the set of all finite sums of such functions $h$, it follows that $L(g) =
L'(g)$ for all $g \in \mathscr{A}$. Also, $\mathscr{A}$ is an algebra of functions on $I^k$ to which the
Stone-Weierstrass theorem applies.

Put $V = \prod_{i=1}^{k} (b_i - a_i)$. If $f \in \mathscr{C}(I^k)$ and $\varepsilon > 0$, there exists $g \in \mathscr{A}$ such
that $\|f - g\| < \varepsilon / V$, where $\|f\|$ is defined as $\max |f(\mathbf{x})|$ $(\mathbf{x} \in I^k)$. Then
$|L(f - g)| < \varepsilon$, $|L'(f - g)| < \varepsilon$, and since

        $L(f) - L'(f) = L(f - g) + L'(g - f),$

we conclude that $|L(f) - L'(f)| < 2\varepsilon$.

In this connection, Exercise 2 is relevant. 
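
To make Definition 10.1 and Theorem 10.2 concrete, here is a small numerical sketch for $k = 2$, added as an illustration only. It approximates each single integration by a midpoint rule; the helper names and the sample integrand are assumptions made for this example. Since both orders use the same grid, the agreement of the two numbers is a sanity check on the bookkeeping of the iterated integrals rather than evidence for the theorem, whose content concerns the exact integrals.

```python
import math

def integrate_1d(func, lo, hi, steps=400):
    """Composite midpoint rule for a continuous function on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def integral_in_given_order(f, a1, b1, a2, b2):
    # L(f): integrate out x_2 first (this produces f_1), then x_1,
    # following Definition 10.1 with k = 2.
    f1 = lambda x1: integrate_1d(lambda x2: f(x1, x2), a2, b2)
    return integrate_1d(f1, a1, b1)

def integral_in_other_order(f, a1, b1, a2, b2):
    # L'(f): integrate out x_1 first, then x_2.
    inner = lambda x2: integrate_1d(lambda x1: f(x1, x2), a1, b1)
    return integrate_1d(inner, a2, b2)

def sample(x1, x2):
    # Hypothetical continuous integrand on the 2-cell [0, 1] x [0, 2].
    return math.exp(x1 * x2) + x1 * math.cos(x2)

L = integral_in_given_order(sample, 0.0, 1.0, 0.0, 2.0)
L_prime = integral_in_other_order(sample, 0.0, 1.0, 0.0, 2.0)
print(L, L_prime)
```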


10.3 Definition The support of a (real or complex) function $f$ on $R^k$ is the
closure of the set of all points $\mathbf{x} \in R^k$ at which $f(\mathbf{x}) \neq 0$. If $f$ is a continuous
function with compact support, let $I^k$ be any $k$-cell which contains the support
of $f$, and define

(3)     $\int_{R^k} f = \int_{I^k} f.$

The integral so defined is evidently independent of the choice of $I^k$, provided
only that $I^k$ contains the support of $f$.

It is now tempting to extend the definition of the integral over R k to 
functions which are limits (in some sense) of continuous functions with compact 
support. We do not want to discuss the conditions under which this can be 
done; the proper setting for this question is the Lebesgue integral. We shall 
merely describe one very simple example which will be used in the proof of 
Stokes’ theorem. 

10.4 Example Let $Q^k$ be the $k$-simplex which consists of all points $\mathbf{x} =
(x_1, \dots, x_k)$ in $R^k$ for which $x_1 + \cdots + x_k \le 1$ and $x_i \ge 0$ for $i = 1, \dots, k$. If
$k = 3$, for example, $Q^k$ is a tetrahedron, with vertices at $\mathbf{0}, \mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$. If $f \in \mathscr{C}(Q^k)$,
extend $f$ to a function on $I^k$ by setting $f(\mathbf{x}) = 0$ off $Q^k$, and define

(4)     $\int_{Q^k} f = \int_{I^k} f.$

Here $I^k$ is the "unit cube" defined by

        $0 \le x_i \le 1 \qquad (1 \le i \le k).$

Since $f$ may be discontinuous on $I^k$, the existence of the integral on the
right of (4) needs proof. We also wish to show that this integral is independent
of the order in which the $k$ single integrations are carried out.

To do this, suppose $0 < \delta < 1$, put

(5)     $\varphi(t) = \begin{cases} 1 & (t \le 1 - \delta) \\ \dfrac{1 - t}{\delta} & (1 - \delta < t \le 1) \\ 0 & (1 < t), \end{cases}$

and define

(6)     $F(\mathbf{x}) = \varphi(x_1 + \cdots + x_k)\, f(\mathbf{x}) \qquad (\mathbf{x} \in I^k).$

Then $F \in \mathscr{C}(I^k)$.

Put $\mathbf{y} = (x_1, \dots, x_{k-1})$, $\mathbf{x} = (\mathbf{y}, x_k)$. For each $\mathbf{y} \in I^{k-1}$, the set of all $x_k$
such that $F(\mathbf{y}, x_k) \neq f(\mathbf{y}, x_k)$ is either empty or is a segment whose length does
not exceed $\delta$. Since $0 \le \varphi \le 1$, it follows that

(7)     $|F_{k-1}(\mathbf{y}) - f_{k-1}(\mathbf{y})| \le \delta \|f\| \qquad (\mathbf{y} \in I^{k-1}),$

where $\|f\|$ has the same meaning as in the proof of Theorem 10.2, and $F_{k-1}$,
$f_{k-1}$ are as in Definition 10.1.

As $\delta \to 0$, (7) exhibits $f_{k-1}$ as a uniform limit of a sequence of continuous
functions. Thus $f_{k-1} \in \mathscr{C}(I^{k-1})$, and the further integrations present no problem.
This proves the existence of the integral (4). Moreover, (7) shows that

(8)     $\left| \int_{I^k} F(\mathbf{x})\,d\mathbf{x} - \int_{I^k} f(\mathbf{x})\,d\mathbf{x} \right| \le \delta \|f\|.$

Note that (8) is true, regardless of the order in which the $k$ single integrations
are carried out. Since $F \in \mathscr{C}(I^k)$, $\int F$ is unaffected by any change in this order.
Hence (8) shows that the same is true of $\int f$.

This completes the proof. 

Our next goal is the change of variables formula stated in Theorem 10.9.
To facilitate its proof, we first discuss so-called primitive mappings, and
partitions of unity. Primitive mappings will enable us to get a clearer picture of the
local action of a $\mathscr{C}'$-mapping with invertible derivative, and partitions of unity
are a very useful device that makes it possible to use local information in a
global setting.


PRIMITIVE MAPPINGS 

10.5 Definition If $\mathbf{G}$ maps an open set $E \subset R^n$ into $R^n$, and if there is an
integer $m$ and a real function $g$ with domain $E$ such that

(9)     $\mathbf{G}(\mathbf{x}) = \sum_{i \neq m} x_i \mathbf{e}_i + g(\mathbf{x})\,\mathbf{e}_m \qquad (\mathbf{x} \in E),$

then we call $\mathbf{G}$ primitive. A primitive mapping is thus one that changes at most
one coordinate. Note that (9) can also be written in the form

(10)    $\mathbf{G}(\mathbf{x}) = \mathbf{x} + [g(\mathbf{x}) - x_m]\,\mathbf{e}_m.$

If $g$ is differentiable at some point $\mathbf{a} \in E$, so is $\mathbf{G}$. The matrix $[\alpha_{ij}]$ of the
operator $\mathbf{G}'(\mathbf{a})$ has

(11)    $(D_1 g)(\mathbf{a}), \dots, (D_m g)(\mathbf{a}), \dots, (D_n g)(\mathbf{a})$

as its $m$th row. For $j \neq m$, we have $\alpha_{jj} = 1$ and $\alpha_{ij} = 0$ if $i \neq j$. The Jacobian
of $\mathbf{G}$ at $\mathbf{a}$ is thus given by

(12)    $J_{\mathbf{G}}(\mathbf{a}) = \det[\mathbf{G}'(\mathbf{a})] = (D_m g)(\mathbf{a}),$

and we see (by Theorem 9.36) that $\mathbf{G}'(\mathbf{a})$ is invertible if and only if $(D_m g)(\mathbf{a}) \neq 0$.

10.6 Definition A linear operator $B$ on $R^n$ that interchanges some pair of
members of the standard basis and leaves the others fixed will be called a flip.

For example, the flip $B$ on $R^4$ that interchanges $\mathbf{e}_2$ and $\mathbf{e}_4$ has the form

(13)    $B(x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + x_3 \mathbf{e}_3 + x_4 \mathbf{e}_4) = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_4 + x_3 \mathbf{e}_3 + x_4 \mathbf{e}_2$

or, equivalently,

(14)    $B(x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + x_3 \mathbf{e}_3 + x_4 \mathbf{e}_4) = x_1 \mathbf{e}_1 + x_4 \mathbf{e}_2 + x_3 \mathbf{e}_3 + x_2 \mathbf{e}_4.$

Hence $B$ can also be thought of as interchanging two of the coordinates, rather
than two basis vectors.

In the proof that follows, we shall use the projections $P_0, \dots, P_n$ in $R^n$,
defined by $P_0 \mathbf{x} = \mathbf{0}$ and

(15)    $P_m \mathbf{x} = x_1 \mathbf{e}_1 + \cdots + x_m \mathbf{e}_m$

for $1 \le m \le n$. Thus $P_m$ is the projection whose range and null space are
spanned by $\{\mathbf{e}_1, \dots, \mathbf{e}_m\}$ and $\{\mathbf{e}_{m+1}, \dots, \mathbf{e}_n\}$, respectively.

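10.7 Theorem Suppose $\mathbf{F}$ is a $\mathscr{C}'$-mapping of an open set $E \subset R^n$ into $R^n$, $\mathbf{0} \in E$,
$\mathbf{F}(\mathbf{0}) = \mathbf{0}$, and $\mathbf{F}'(\mathbf{0})$ is invertible.

Then there is a neighborhood of $\mathbf{0}$ in $R^n$ in which a representation

(16)    $\mathbf{F}(\mathbf{x}) = B_1 \cdots B_{n-1} \mathbf{G}_n \circ \cdots \circ \mathbf{G}_1(\mathbf{x})$

is valid.

In (16), each $\mathbf{G}_i$ is a primitive $\mathscr{C}'$-mapping in some neighborhood of $\mathbf{0}$;
$\mathbf{G}_i(\mathbf{0}) = \mathbf{0}$, $\mathbf{G}_i'(\mathbf{0})$ is invertible, and each $B_i$ is either a flip or the identity operator.

Briefly, (16) represents $\mathbf{F}$ locally as a composition of primitive mappings
and flips.

Proof Put $\mathbf{F} = \mathbf{F}_1$. Assume $1 \le m \le n - 1$, and make the following
induction hypothesis (which evidently holds for $m = 1$):

$V_m$ is a neighborhood of $\mathbf{0}$, $\mathbf{F}_m \in \mathscr{C}'(V_m)$, $\mathbf{F}_m(\mathbf{0}) = \mathbf{0}$, $\mathbf{F}_m'(\mathbf{0})$ is invertible,
and

(17)    $P_{m-1} \mathbf{F}_m(\mathbf{x}) = P_{m-1} \mathbf{x} \qquad (\mathbf{x} \in V_m).$

By (17), we have

(18)    $\mathbf{F}_m(\mathbf{x}) = P_{m-1} \mathbf{x} + \sum_{i=m}^{n} \alpha_i(\mathbf{x})\,\mathbf{e}_i,$

where $\alpha_m, \dots, \alpha_n$ are real $\mathscr{C}'$-functions in $V_m$. Hence

(19)    $\mathbf{F}_m'(\mathbf{0})\,\mathbf{e}_m = \sum_{i=m}^{n} (D_m \alpha_i)(\mathbf{0})\,\mathbf{e}_i.$

Since $\mathbf{F}_m'(\mathbf{0})$ is invertible, the left side of (19) is not $\mathbf{0}$, and therefore there
is a $k$ such that $m \le k \le n$ and $(D_m \alpha_k)(\mathbf{0}) \neq 0$.

Let $B_m$ be the flip that interchanges $m$ and this $k$ (if $k = m$, $B_m$ is the
identity) and define

(20)    $\mathbf{G}_m(\mathbf{x}) = \mathbf{x} + [\alpha_k(\mathbf{x}) - x_m]\,\mathbf{e}_m \qquad (\mathbf{x} \in V_m).$

Then $\mathbf{G}_m \in \mathscr{C}'(V_m)$, $\mathbf{G}_m$ is primitive, and $\mathbf{G}_m'(\mathbf{0})$ is invertible, since
$(D_m \alpha_k)(\mathbf{0}) \neq 0$.

The inverse function theorem shows therefore that there is an open
set $U_m$, with $\mathbf{0} \in U_m \subset V_m$, such that $\mathbf{G}_m$ is a 1-1 mapping of $U_m$ onto a
neighborhood $V_{m+1}$ of $\mathbf{0}$, in which $\mathbf{G}_m^{-1}$ is continuously differentiable.
Define $\mathbf{F}_{m+1}$ by

(21)    $\mathbf{F}_{m+1}(\mathbf{y}) = B_m \mathbf{F}_m \circ \mathbf{G}_m^{-1}(\mathbf{y}) \qquad (\mathbf{y} \in V_{m+1}).$

Then $\mathbf{F}_{m+1} \in \mathscr{C}'(V_{m+1})$, $\mathbf{F}_{m+1}(\mathbf{0}) = \mathbf{0}$, and $\mathbf{F}_{m+1}'(\mathbf{0})$ is invertible (by
the chain rule). Also, for $\mathbf{x} \in U_m$,

(22)    $P_m \mathbf{F}_{m+1}(\mathbf{G}_m(\mathbf{x})) = P_m B_m \mathbf{F}_m(\mathbf{x})$
        $\qquad = P_m [P_{m-1} \mathbf{x} + \alpha_k(\mathbf{x})\,\mathbf{e}_m + \cdots]$
        $\qquad = P_{m-1} \mathbf{x} + \alpha_k(\mathbf{x})\,\mathbf{e}_m$
        $\qquad = P_m \mathbf{G}_m(\mathbf{x})$

so that

(23)    $P_m \mathbf{F}_{m+1}(\mathbf{y}) = P_m \mathbf{y} \qquad (\mathbf{y} \in V_{m+1}).$

Our induction hypothesis holds therefore with $m + 1$ in place of $m$.

[In (22), we first used (21), then (18) and the definition of $B_m$, then
the definition of $P_m$, and finally (20).]

Since $B_m B_m = I$, (21), with $\mathbf{y} = \mathbf{G}_m(\mathbf{x})$, is equivalent to

(24)    $\mathbf{F}_m(\mathbf{x}) = B_m \mathbf{F}_{m+1}(\mathbf{G}_m(\mathbf{x})) \qquad (\mathbf{x} \in U_m).$

If we apply this with $m = 1, \dots, n - 1$, we successively obtain

        $\mathbf{F} = \mathbf{F}_1 = B_1 \mathbf{F}_2 \circ \mathbf{G}_1$
        $\qquad = B_1 B_2 \mathbf{F}_3 \circ \mathbf{G}_2 \circ \mathbf{G}_1 = \cdots$
        $\qquad = B_1 \cdots B_{n-1} \mathbf{F}_n \circ \mathbf{G}_{n-1} \circ \cdots \circ \mathbf{G}_1$

in some neighborhood of $\mathbf{0}$. By (17), $\mathbf{F}_n$ is primitive. This completes the
proof.


PARTITIONS OF UNITY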

10.8 Theorem Suppose $K$ is a compact subset of $R^n$, and $\{V_\alpha\}$ is an open cover
of $K$. Then there exist functions $\psi_1, \dots, \psi_s \in \mathscr{C}(R^n)$ such that

(a) $0 \le \psi_i \le 1$ for $1 \le i \le s$;

(b) each $\psi_i$ has its support in some $V_\alpha$, and

(c) $\psi_1(\mathbf{x}) + \cdots + \psi_s(\mathbf{x}) = 1$ for every $\mathbf{x} \in K$.

Because of (c), $\{\psi_i\}$ is called a partition of unity, and (b) is sometimes
expressed by saying that $\{\psi_i\}$ is subordinate to the cover $\{V_\alpha\}$.

Corollary If $f \in \mathscr{C}(R^n)$ and the support of $f$ lies in $K$, then

(25)    $f = \sum_{i=1}^{s} \psi_i f.$

Each $\psi_i f$ has its support in some $V_\alpha$.

The point of (25) is that it furnishes a representation of $f$ as a sum of
continuous functions $\psi_i f$ with "small" supports.

Proof Associate with each $\mathbf{x} \in K$ an index $\alpha(\mathbf{x})$ so that $\mathbf{x} \in V_{\alpha(\mathbf{x})}$. Then
there are open balls $B(\mathbf{x})$ and $W(\mathbf{x})$, centered at $\mathbf{x}$, with

(26)    $\overline{B(\mathbf{x})} \subset W(\mathbf{x}) \subset \overline{W(\mathbf{x})} \subset V_{\alpha(\mathbf{x})}.$

Since $K$ is compact, there are points $\mathbf{x}_1, \dots, \mathbf{x}_s$ in $K$ such that

(27)    $K \subset B(\mathbf{x}_1) \cup \cdots \cup B(\mathbf{x}_s).$

By (26), there are functions $\varphi_1, \dots, \varphi_s \in \mathscr{C}(R^n)$, such that $\varphi_i(\mathbf{x}) = 1$ on
$B(\mathbf{x}_i)$, $\varphi_i(\mathbf{x}) = 0$ outside $W(\mathbf{x}_i)$, and $0 \le \varphi_i(\mathbf{x}) \le 1$ on $R^n$. Define $\psi_1 = \varphi_1$
and

(28)    $\psi_{i+1} = (1 - \varphi_1) \cdots (1 - \varphi_i)\,\varphi_{i+1}$

for $i = 1, \dots, s - 1$.

Properties (a) and (b) are clear. The relation

(29)    $\psi_1 + \cdots + \psi_i = 1 - (1 - \varphi_1) \cdots (1 - \varphi_i)$

is trivial for $i = 1$. If (29) holds for some $i < s$, addition of (28) and (29)
yields (29) with $i + 1$ in place of $i$. It follows that

(30)    $\sum_{i=1}^{s} \psi_i(\mathbf{x}) = 1 - \prod_{i=1}^{s} [1 - \varphi_i(\mathbf{x})] \qquad (\mathbf{x} \in R^n).$

If $\mathbf{x} \in K$, then $\mathbf{x} \in B(\mathbf{x}_i)$ for some $i$, hence $\varphi_i(\mathbf{x}) = 1$, and the product in
(30) is 0. This proves (c).
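
The construction in this proof is explicit enough to carry out by machine. The sketch below, an illustration added under stated assumptions, does so in $R^1$ for $K = [0, 3]$ and the hypothetical cover $V_1 = (-1, 1.5)$, $V_2 = (1, 2.5)$, $V_3 = (2, 4)$. The centers and radii of the balls $B(x_i) \subset W(x_i)$ are chosen by hand so that (26) and (27) hold, and each $\varphi_i$ is taken to be piecewise linear; the names `bump` and `partition_of_unity` are invented for this example.

```python
def bump(center, inner, outer):
    """Continuous phi with phi = 1 where |x - center| <= inner (on B),
    phi = 0 where |x - center| >= outer (outside W), linear in between."""
    def phi(x):
        value = (outer - abs(x - center)) / (outer - inner)
        return min(1.0, max(0.0, value))
    return phi

def partition_of_unity(phis):
    """Build psi_1, ..., psi_s from phi_1, ..., phi_s by formula (28)."""
    def make_psi(i):
        def psi(x):
            product = phis[i](x)
            for j in range(i):
                product *= 1.0 - phis[j](x)
            return product
        return psi
    return [make_psi(i) for i in range(len(phis))]

# Balls B(x_i) = (c - inner, c + inner) cover K = [0, 3];
# each closed ball of radius `outer` lies inside the corresponding V_i.
phis = [bump(0.25, 0.9, 1.2), bump(1.75, 0.65, 0.7), bump(3.0, 0.9, 0.95)]
psis = partition_of_unity(phis)

grid = [3.0 * n / 600 for n in range(601)]          # sample points of K
sums = [sum(psi(x) for psi in psis) for x in grid]  # should all equal 1 on K
print(min(sums), max(sums))
```

On the sampled points of $K$ the sums equal 1 up to rounding, in agreement with (30), and each $\psi_i$ vanishes outside the corresponding $W(x_i)$.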
CHANGE OF VARIABLES

We can now describe the effect of a change of variables on a multiple integral.
For simplicity, we confine ourselves here to continuous functions with compact
support, although this is too restrictive for many applications. This is illustrated
by Exercises 9 to 13.

10.9 Theorem Suppose $T$ is a 1-1 $\mathscr{C}'$-mapping of an open set $E \subset R^k$ into $R^k$
such that $J_T(\mathbf{x}) \neq 0$ for all $\mathbf{x} \in E$. If $f$ is a continuous function on $R^k$ whose support
is compact and lies in $T(E)$, then

(31)    $\int_{R^k} f(\mathbf{y})\,d\mathbf{y} = \int_{R^k} f(T(\mathbf{x}))\,|J_T(\mathbf{x})|\,d\mathbf{x}.$

We recall that $J_T$ is the Jacobian of $T$. The assumption $J_T(\mathbf{x}) \neq 0$ implies,
by the inverse function theorem, that $T^{-1}$ is continuous on $T(E)$, and this
ensures that the integrand on the right of (31) has compact support in $E$
(Theorem 4.14).

The appearance of the absolute value of $J_T(\mathbf{x})$ in (31) may call for a
comment. Take the case $k = 1$, and suppose $T$ is a 1-1 $\mathscr{C}'$-mapping of $R^1$ onto $R^1$.
Then $J_T(x) = T'(x)$; and if $T$ is increasing, we have

(32)    $\int_{R^1} f(y)\,dy = \int_{R^1} f(T(x))\,T'(x)\,dx,$

by Theorems 6.19 and 6.17, for all continuous $f$ with compact support. But if
$T$ decreases, then $T'(x) < 0$; and if $f$ is positive in the interior of its support,
the left side of (32) is positive and the right side is negative. A correct equation
is obtained if $T'$ is replaced by $|T'|$ in (32).

The point is that the integrals we are now considering are integrals of 
functions over subsets of R k , and we associate no direction or orientation with 
these subsets. We shall adopt a different point of view when we come to inte- 
gration of differential forms over surfaces. 
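
The role of the absolute value can be seen in a small numerical experiment, added here as an illustration. The sketch assumes a particular decreasing map $T(x) = -x^3 - x$ of $R^1$ onto $R^1$ and a particular continuous $f$ with support $[-1, 1]$; both choices, and all function names, are hypothetical. Integrals are approximated by a midpoint rule on $[-2, 2]$, which contains the supports of all three integrands.

```python
def integrate(func, lo, hi, steps=20000):
    """Composite midpoint rule on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def f(y):
    # Continuous, nonnegative, with compact support [-1, 1]; its integral is 4/3.
    return max(0.0, 1.0 - y * y)

def T(x):
    # A decreasing 1-1 C'-mapping of R^1 onto R^1 (assumed for this example).
    return -x**3 - x

def T_prime(x):
    return -3.0 * x * x - 1.0

left = integrate(f, -2.0, 2.0)
with_abs = integrate(lambda x: f(T(x)) * abs(T_prime(x)), -2.0, 2.0)
without_abs = integrate(lambda x: f(T(x)) * T_prime(x), -2.0, 2.0)
print(left, with_abs, without_abs)
```

The first two numbers should agree, as (31) asserts, while the third should have the opposite sign, which is the failure of (32) for decreasing $T$ described above.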

Proof It follows from the remarks just made that (31) is true if $T$ is a
primitive $\mathscr{C}'$-mapping (see Definition 10.5), and Theorem 10.2 shows
that (31) is true if $T$ is a linear mapping which merely interchanges two
coordinates.

If the theorem is true for transformations $P$, $Q$, and if $S(\mathbf{x}) = P(Q(\mathbf{x}))$,
then

        $\int f(\mathbf{z})\,d\mathbf{z} = \int f(P(\mathbf{y}))\,|J_P(\mathbf{y})|\,d\mathbf{y}$
        $\qquad = \int f(P(Q(\mathbf{x})))\,|J_P(Q(\mathbf{x}))|\,|J_Q(\mathbf{x})|\,d\mathbf{x}$
        $\qquad = \int f(S(\mathbf{x}))\,|J_S(\mathbf{x})|\,d\mathbf{x},$

since

        $J_P(Q(\mathbf{x}))\,J_Q(\mathbf{x}) = \det P'(Q(\mathbf{x})) \det Q'(\mathbf{x})$
        $\qquad = \det P'(Q(\mathbf{x}))\,Q'(\mathbf{x}) = \det S'(\mathbf{x}) = J_S(\mathbf{x}),$

by the multiplication theorem for determinants and the chain rule. Thus
the theorem is also true for $S$.

Each point $\mathbf{a} \in E$ has a neighborhood $U \subset E$ in which

(33)    $T(\mathbf{x}) = T(\mathbf{a}) + B_1 \cdots B_{k-1} \mathbf{G}_k \circ \mathbf{G}_{k-1} \circ \cdots \circ \mathbf{G}_1(\mathbf{x} - \mathbf{a}),$

where $\mathbf{G}_i$ and $B_i$ are as in Theorem 10.7. Setting $V = T(U)$, it follows
that (31) holds if the support of $f$ lies in $V$. Thus:

Each point $\mathbf{y} \in T(E)$ lies in an open set $V_{\mathbf{y}} \subset T(E)$ such that (31) holds
for all continuous functions whose support lies in $V_{\mathbf{y}}$.

Now let $f$ be a continuous function with compact support $K \subset T(E)$.
Since $\{V_{\mathbf{y}}\}$ covers $K$, the Corollary to Theorem 10.8 shows that $f = \sum \psi_i f$,
where each $\psi_i$ is continuous, and each $\psi_i$ has its support in some $V_{\mathbf{y}}$.
Thus (31) holds for each $\psi_i f$, and hence also for their sum $f$.


DIFFERENTIAL FORMS 

We shall now develop some of the machinery that is needed for the
$n$-dimensional version of the fundamental theorem of calculus which is usually called
Stokes' theorem. The original form of Stokes' theorem arose in applications of
vector analysis to electromagnetism and was stated in terms of the curl of a 
vector field. Green's theorem and the divergence theorem are other special 
cases. These topics are briefly discussed at the end of the chapter. 

It is a curious feature of Stokes' theorem that the only thing that is difficult 
about it is the elaborate structure of definitions that are needed for its statement. 
These definitions concern differential forms, their derivatives, boundaries, and 
orientation. Once these concepts are understood, the statement of the theorem 
is very brief and succinct, and its proof presents little difficulty. 

Up to now we have considered derivatives of functions of several variables 
only for functions defined in open sets. This was done to avoid difficulties that 
can occur at boundary points. It will now be convenient, however, to discuss 
differentiable functions on compact sets. We therefore adopt the following 
convention: 

To say that $\mathbf{f}$ is a $\mathscr{C}'$-mapping (or a $\mathscr{C}''$-mapping) of a compact set
$D \subset R^k$ into $R^n$ means that there is a $\mathscr{C}'$-mapping (or a $\mathscr{C}''$-mapping) $\mathbf{g}$ of
an open set $W \subset R^k$ into $R^n$ such that $D \subset W$ and such that $\mathbf{g}(\mathbf{x}) = \mathbf{f}(\mathbf{x})$ for
all $\mathbf{x} \in D$.

10.10 Definition Suppose $E$ is an open set in $R^n$. A $k$-surface in $E$ is a
$\mathscr{C}'$-mapping $\Phi$ from a compact set $D \subset R^k$ into $E$.

$D$ is called the parameter domain of $\Phi$. Points of $D$ will be denoted by
$\mathbf{u} = (u_1, \dots, u_k)$.

We shall confine ourselves to the simple situation in which $D$ is either a
$k$-cell or the $k$-simplex $Q^k$ described in Example 10.4. The reason for this is
that we shall have to integrate over $D$, and we have not yet discussed integration
over more complicated subsets of $R^k$. It will be seen that this restriction on $D$
(which will be tacitly made from now on) entails no significant loss of generality
in the resulting theory of differential forms.

We stress that $k$-surfaces in $E$ are defined to be mappings into $E$, not
subsets of $E$. This agrees with our earlier definition of curves (Definition 6.26).
In fact, 1-surfaces are precisely the same as continuously differentiable curves.

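10.11 Definition Suppose $E$ is an open set in $R^n$. A differential form of order
$k \ge 1$ in $E$ (briefly, a k-form in $E$) is a function $\omega$, symbolically represented by
the sum

(34) $\omega = \sum a_{i_1 \cdots i_k}(\mathbf{x}) \, dx_{i_1} \wedge \cdots \wedge dx_{i_k}$

(the indices $i_1, \ldots, i_k$ range independently from 1 to $n$), which assigns to each
k-surface $\Phi$ in $E$ a number $\omega(\Phi) = \int_\Phi \omega$, according to the rule

(35) $\int_\Phi \omega = \int_D \sum a_{i_1 \cdots i_k}(\Phi(\mathbf{u})) \, \frac{\partial(x_{i_1}, \ldots, x_{i_k})}{\partial(u_1, \ldots, u_k)} \, d\mathbf{u},$

where $D$ is the parameter domain of $\Phi$.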

The functions $a_{i_1 \cdots i_k}$ are assumed to be real and continuous in $E$. If
$\phi_1, \ldots, \phi_n$ are the components of $\Phi$, the Jacobian in (35) is the one determined
by the mapping

$(u_1, \ldots, u_k) \to (\phi_{i_1}(\mathbf{u}), \ldots, \phi_{i_k}(\mathbf{u})).$

Note that the right side of (35) is an integral over $D$, as defined in Definition
10.1 (or Example 10.4) and that (35) is the definition of the symbol $\int_\Phi \omega$.

A k-form $\omega$ is said to be of class $\mathscr{C}'$ or $\mathscr{C}''$ if the functions $a_{i_1 \cdots i_k}$ in (34)
are all of class $\mathscr{C}'$ or $\mathscr{C}''$.

A 0-form in E is defined to be a continuous function in E. 

10.12 Examples 

(a) Let $\gamma$ be a 1-surface (a curve of class $\mathscr{C}'$) in $R^3$, with parameter
domain [0, 1].

Write $(x, y, z)$ in place of $(x_1, x_2, x_3)$, and put

$\omega = x \, dy + y \, dx.$

Then

$\int_\gamma \omega = \int_0^1 [\gamma_1(t)\gamma_2'(t) + \gamma_2(t)\gamma_1'(t)] \, dt = \gamma_1(1)\gamma_2(1) - \gamma_1(0)\gamma_2(0).$

Note that in this example $\int_\gamma \omega$ depends only on the initial point $\gamma(0)$
and on the end point $\gamma(1)$ of $\gamma$. In particular, $\int_\gamma \omega = 0$ for every closed
curve $\gamma$. (As we shall see later, this is true for every 1-form $\omega$ which is
exact.)

Integrals of 1-forms are often called line integrals.

(b) Fix $a > 0$, $b > 0$, and define

$\gamma(t) = (a \cos t, \, b \sin t) \qquad (0 \le t \le 2\pi),$

so that $\gamma$ is a closed curve in $R^2$. (Its range is an ellipse.) Then

$\int_\gamma x \, dy = \int_0^{2\pi} ab \cos^2 t \, dt = \pi ab,$

whereas

$\int_\gamma y \, dx = -\int_0^{2\pi} ab \sin^2 t \, dt = -\pi ab.$

Note that $\int_\gamma x \, dy$ is the area of the region bounded by $\gamma$. This is a
special case of Green's theorem.
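
The two integrals in (b) can be checked numerically. The following is a minimal sketch and not part of the text: the helper `line_integral`, the use of the midpoint rule, and the sample values a = 3, b = 2 are assumptions made for illustration. It evaluates rule (35) for k = 1 by summing the integrand along the parametrized ellipse.

```python
import math

def line_integral(form, curve, dcurve, t0, t1, n=20000):
    """Midpoint-rule approximation of the integral of P dx + Q dy over a
    curve, i.e. rule (35) with k = 1. `form(x, y)` returns (P, Q)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        x, y = curve(t)
        dx, dy = dcurve(t)          # components of gamma'(t)
        p, q = form(x, y)
        total += (p * dx + q * dy) * h
    return total

a, b = 3.0, 2.0                     # illustrative semi-axes (assumed values)
gamma = lambda t: (a * math.cos(t), b * math.sin(t))
dgamma = lambda t: (-a * math.sin(t), b * math.cos(t))

x_dy = line_integral(lambda x, y: (0.0, x), gamma, dgamma, 0.0, 2 * math.pi)
y_dx = line_integral(lambda x, y: (y, 0.0), gamma, dgamma, 0.0, 2 * math.pi)
```

Under these assumptions `x_dy` should agree with $\pi ab$ and `y_dx` with $-\pi ab$ up to rounding, so their sum (the integral of the form of part (a) over this closed curve) is approximately zero.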

(c) Let $D$ be the 3-cell defined by

$0 \le r \le 1, \quad 0 \le \theta \le \pi, \quad 0 \le \varphi \le 2\pi.$

Define $\Phi(r, \theta, \varphi) = (x, y, z)$, where

$x = r \sin\theta \cos\varphi$
$y = r \sin\theta \sin\varphi$
$z = r \cos\theta.$

Then

$J_\Phi(r, \theta, \varphi) = \frac{\partial(x, y, z)}{\partial(r, \theta, \varphi)} = r^2 \sin\theta.$

Hence

(36) $\int_\Phi dx \wedge dy \wedge dz = \int_D J_\Phi = \frac{4\pi}{3}.$

Note that $\Phi$ maps $D$ onto the closed unit ball of $R^3$, that the mapping
is 1-1 in the interior of $D$ (but certain boundary points are identified by
$\Phi$), and that the integral (36) is equal to the volume of $\Phi(D)$.

10.13 Elementary properties Let $\omega$, $\omega_1$, $\omega_2$ be k-forms in $E$. We write $\omega_1 = \omega_2$
if and only if $\omega_1(\Phi) = \omega_2(\Phi)$ for every k-surface $\Phi$ in $E$. In particular, $\omega = 0$
means that $\omega(\Phi) = 0$ for every k-surface $\Phi$ in $E$. If $c$ is a real number, then
$c\omega$ is the k-form defined by

(37) $\int_\Phi c\omega = c \int_\Phi \omega,$

and $\omega = \omega_1 + \omega_2$ means that

(38) $\int_\Phi \omega = \int_\Phi \omega_1 + \int_\Phi \omega_2$

for every k-surface $\Phi$ in $E$. As a special case of (37), note that $-\omega$ is defined so
that

(39) $\int_\Phi (-\omega) = -\int_\Phi \omega.$

Consider a k-form

(40) $\omega = a(\mathbf{x}) \, dx_{i_1} \wedge \cdots \wedge dx_{i_k}$

and let $\bar\omega$ be the k-form obtained by interchanging some pair of subscripts in
(40). If (35) and (39) are combined with the fact that a determinant changes
sign if two of its rows are interchanged, we see that

(41) $\bar\omega = -\omega.$

As a special case of this, note that the anticommutative relation

(42) $dx_i \wedge dx_j = -dx_j \wedge dx_i$

holds for all $i$ and $j$. In particular,

(43) $dx_i \wedge dx_i = 0 \qquad (i = 1, \ldots, n).$

More generally, let us return to (40), and assume that $i_r = i_s$ for some
$r \ne s$. If these two subscripts are interchanged, then $\bar\omega = \omega$, hence $\omega = 0$, by
(41).

In other words, if $\omega$ is given by (40), then $\omega = 0$ unless the subscripts
$i_1, \ldots, i_k$ are all distinct.

If $\omega$ is as in (34), the summands with repeated subscripts can therefore
be omitted without changing $\omega$.

It follows that 0 is the only k-form in any open subset of $R^n$, if $k > n$.
The anticommutativity expressed by (42) is the reason for the inordinate
amount of attention that has to be paid to minus signs when studying differential
forms.

10.14 Basic k-forms If $i_1, \ldots, i_k$ are integers such that $1 \le i_1 < i_2 < \cdots
< i_k \le n$, and if $I$ is the ordered k-tuple $\{i_1, \ldots, i_k\}$, then we call $I$ an increasing
k-index, and we use the brief notation

(44) $dx_I = dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$

These forms $dx_I$ are the so-called basic k-forms in $R^n$.

It is not hard to verify that there are precisely $n!/k!(n-k)!$ basic k-forms
in $R^n$; we shall make no use of this, however.

Much more important is the fact that every k-form can be represented in
terms of basic k-forms. To see this, note that every k-tuple $\{j_1, \ldots, j_k\}$ of distinct
integers can be converted to an increasing k-index $J$ by a finite number of
interchanges of pairs; each of these amounts to a multiplication by $-1$, as we saw
in Sec. 10.13; hence

(45) $dx_{j_1} \wedge \cdots \wedge dx_{j_k} = \varepsilon(j_1, \ldots, j_k) \, dx_J$

where $\varepsilon(j_1, \ldots, j_k)$ is 1 or $-1$, depending on the number of interchanges that
are needed. In fact, it is easy to see that

(46) $\varepsilon(j_1, \ldots, j_k) = s(j_1, \ldots, j_k)$

where $s$ is as in Definition 9.33.

For example,

$dx_1 \wedge dx_5 \wedge dx_3 \wedge dx_2 = -dx_1 \wedge dx_2 \wedge dx_3 \wedge dx_5$

and

$dx_4 \wedge dx_2 \wedge dx_3 = dx_2 \wedge dx_3 \wedge dx_4.$

If every k-tuple in (34) is converted to an increasing k-index, then we
obtain the so-called standard presentation of $\omega$:

(47) $\omega = \sum_I b_I(\mathbf{x}) \, dx_I.$

The summation in (47) extends over all increasing k-indices $I$. [Of course, every
increasing k-index arises from many (from $k!$, to be precise) k-tuples. Each
$b_I$ in (47) may thus be a sum of several of the coefficients that occur in (34).]
For example,

$x_1 \, dx_2 \wedge dx_1 - x_2 \, dx_3 \wedge dx_2 + x_3 \, dx_2 \wedge dx_3 + dx_1 \wedge dx_2$

is a 2-form in $R^3$ whose standard presentation is

$(1 - x_1) \, dx_1 \wedge dx_2 + (x_2 + x_3) \, dx_2 \wedge dx_3.$
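
The conversion to a standard presentation is mechanical, and a short sketch may make it concrete. The code below is an illustration only: the function names and the representation of a form as a list of (coefficient, index tuple) pairs with numeric coefficients (that is, the form evaluated at one fixed point) are assumptions, not notation from the text. The sign is obtained by counting inversions, which has the same parity as the number of interchanges used in (45).

```python
def sign_and_sorted(indices):
    """Return (sign, increasing tuple) for a tuple of distinct indices,
    or (0, None) if an index repeats (such a term is 0 by Sec. 10.13)."""
    if len(set(indices)) != len(indices):
        return 0, None
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return (-1 if inversions % 2 else 1), tuple(sorted(indices))

def standard_presentation(terms):
    """terms: list of (coefficient, index tuple), coefficients being numbers.
    Returns {increasing index tuple: summed coefficient}, zeros dropped."""
    result = {}
    for coeff, indices in terms:
        sign, increasing = sign_and_sorted(indices)
        if sign == 0:
            continue
        result[increasing] = result.get(increasing, 0) + sign * coeff
    return {key: value for key, value in result.items() if value != 0}

# The 2-form of the example, evaluated at the sample point (x1, x2, x3) = (2, 3, 5).
x1, x2, x3 = 2, 3, 5
example = standard_presentation(
    [(x1, (2, 1)), (-x2, (3, 2)), (x3, (2, 3)), (1, (1, 2))]
)
```

At the sample point, `example` holds the coefficient $1 - x_1$ for the index (1, 2) and $x_2 + x_3$ for the index (2, 3), matching the standard presentation displayed above.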

The following uniqueness theorem is one of the main reasons for the
introduction of the standard presentation of a k-form.

10.15 Theorem Suppose

(48) $\omega = \sum_I b_I(\mathbf{x}) \, dx_I$

is the standard presentation of a k-form $\omega$ in an open set $E \subset R^n$. If $\omega = 0$ in $E$,
then $b_I(\mathbf{x}) = 0$ for every increasing k-index $I$ and for every $\mathbf{x} \in E$.

Note that the analogous statement would be false for sums such as (34),
since, for example,

$dx_1 \wedge dx_2 + dx_2 \wedge dx_1 = 0.$

Proof Assume, to reach a contradiction, that $b_J(\mathbf{v}) > 0$ for some $\mathbf{v} \in E$
and for some increasing k-index $J = \{j_1, \ldots, j_k\}$. Since $b_J$ is continuous,
there exists $h > 0$ such that $b_J(\mathbf{x}) > 0$ for all $\mathbf{x} \in R^n$ whose coordinates
satisfy $|x_i - v_i| \le h$. Let $D$ be the k-cell in $R^k$ such that $\mathbf{u} \in D$ if and
only if $|u_r| \le h$ for $r = 1, \ldots, k$. Define

(49) $\Phi(\mathbf{u}) = \mathbf{v} + \sum_{r=1}^{k} u_r \mathbf{e}_{j_r} \qquad (\mathbf{u} \in D).$

Then $\Phi$ is a k-surface in $E$, with parameter domain $D$, and $b_J(\Phi(\mathbf{u})) > 0$
for every $\mathbf{u} \in D$.

We claim that

(50) $\int_\Phi \omega = \int_D b_J(\Phi(\mathbf{u})) \, d\mathbf{u}.$

Since the right side of (50) is positive, it follows that $\omega(\Phi) \ne 0$. Hence
(50) gives our contradiction.

To prove (50), apply (35) to the presentation (48). More specifically,
compute the Jacobians that occur in (35). By (49),

$\frac{\partial(x_{j_1}, \ldots, x_{j_k})}{\partial(u_1, \ldots, u_k)} = 1.$

For any other increasing k-index $I \ne J$, the Jacobian is 0, since it is the
determinant of a matrix with at least one row of zeros.


10.16 Products of basic k-forms Suppose

(51) $I = \{i_1, \ldots, i_p\}, \qquad J = \{j_1, \ldots, j_q\}$

where $1 \le i_1 < \cdots < i_p \le n$ and $1 \le j_1 < \cdots < j_q \le n$. The product of the
corresponding basic forms $dx_I$ and $dx_J$ in $R^n$ is a $(p + q)$-form in $R^n$, denoted by
the symbol $dx_I \wedge dx_J$, and defined by

(52) $dx_I \wedge dx_J = dx_{i_1} \wedge \cdots \wedge dx_{i_p} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_q}.$

If $I$ and $J$ have an element in common, then the discussion in Sec. 10.13
shows that $dx_I \wedge dx_J = 0$.

If $I$ and $J$ have no element in common, let us write $[I, J]$ for the increasing
$(p + q)$-index which is obtained by arranging the members of $I \cup J$ in increasing
order. Then $dx_{[I,J]}$ is a basic $(p + q)$-form. We claim that

(53) $dx_I \wedge dx_J = (-1)^\alpha \, dx_{[I,J]}$

where $\alpha$ is the number of differences $j_t - i_s$ that are negative. (The number of
positive differences is thus $pq - \alpha$.)

To prove (53), perform the following operations on the numbers

(54) $i_1, \ldots, i_p; \; j_1, \ldots, j_q.$

Move $i_p$ to the right, step by step, until its right neighbor is larger than $i_p$ (or
until it has no right neighbor). The number of steps is the number of subscripts
$t$ such that $j_t < i_p$. (Note that 0 steps are a distinct possibility.) Then do the
same for $i_{p-1}, \ldots, i_1$. The total number of steps taken is $\alpha$. The final
arrangement reached is $[I, J]$. Each step, when applied to the right side of (52),
multiplies $dx_I \wedge dx_J$ by $-1$. Hence (53) holds.
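
The counting rule in (53) is easy to experiment with. The sketch below is illustrative only; the names `alpha`, `product_sign`, and `merged_index` are assumptions, and increasing indices are represented as Python tuples. It computes $\alpha$ directly as the number of negative differences $j_t - i_s$ and returns the resulting sign.

```python
def alpha(I, J):
    """Number of pairs (s, t) with j_t - i_s < 0, as in (53)."""
    return sum(1 for i in I for j in J if j < i)

def product_sign(I, J):
    """Sign in dx_I ^ dx_J = sign * dx_[I,J]; 0 if I and J share an index."""
    if set(I) & set(J):
        return 0
    return -1 if alpha(I, J) % 2 else 1

def merged_index(I, J):
    """The increasing index [I, J] for disjoint increasing tuples I and J."""
    return tuple(sorted(I + J))

# dx_2 ^ dx_1 = -dx_1 ^ dx_2: one negative difference.
sign_21 = product_sign((2,), (1,))
# I = (1, 4), J = (2, 3): the differences 2 - 4 and 3 - 4 are negative.
sign_14_23 = product_sign((1, 4), (2, 3))
```

With these inputs `sign_21` is $-1$ and `sign_14_23` is $+1$, in agreement with moving $dx_4$ past $dx_2$ and $dx_3$ by two interchanges.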

Note that the right side of (53) is the standard presentation of $dx_I \wedge dx_J$.
Next, let $K = (k_1, \ldots, k_r)$ be an increasing r-index in $\{1, \ldots, n\}$. We shall
use (53) to prove that

(55) $(dx_I \wedge dx_J) \wedge dx_K = dx_I \wedge (dx_J \wedge dx_K).$

If any two of the sets $I$, $J$, $K$ have an element in common, then each side
of (55) is 0, hence they are equal.

So let us assume that $I$, $J$, $K$ are pairwise disjoint. Let $[I, J, K]$ denote
the increasing $(p + q + r)$-index obtained from their union. Associate $\beta$ with
the ordered pair $(J, K)$ and $\gamma$ with the ordered pair $(I, K)$ in the way that $\alpha$ was
associated with $(I, J)$ in (53). The left side of (55) is then

$(-1)^\alpha \, dx_{[I,J]} \wedge dx_K = (-1)^\alpha (-1)^{\beta+\gamma} \, dx_{[I,J,K]}$

by two applications of (53), and the right side of (55) is

$(-1)^\beta \, dx_I \wedge dx_{[J,K]} = (-1)^\beta (-1)^{\alpha+\gamma} \, dx_{[I,J,K]}.$

Hence (55) is correct.

10.17 Multiplication Suppose $\omega$ and $\lambda$ are p- and q-forms, respectively, in
some open set $E \subset R^n$, with standard presentations

(56) $\omega = \sum_I b_I(\mathbf{x}) \, dx_I, \qquad \lambda = \sum_J c_J(\mathbf{x}) \, dx_J$

where $I$ and $J$ range over all increasing p-indices and over all increasing q-indices
taken from the set $\{1, \ldots, n\}$.

Their product, denoted by the symbol $\omega \wedge \lambda$, is defined to be

(57) $\omega \wedge \lambda = \sum_{I,J} b_I(\mathbf{x}) c_J(\mathbf{x}) \, dx_I \wedge dx_J.$

In this sum, $I$ and $J$ range independently over their possible values, and $dx_I \wedge dx_J$
is as in Sec. 10.16. Thus $\omega \wedge \lambda$ is a $(p + q)$-form in $E$.

It is quite easy to see (we leave the details as an exercise) that the
distributive laws

$(\omega_1 + \omega_2) \wedge \lambda = (\omega_1 \wedge \lambda) + (\omega_2 \wedge \lambda)$

and

$\omega \wedge (\lambda_1 + \lambda_2) = (\omega \wedge \lambda_1) + (\omega \wedge \lambda_2)$

hold, with respect to the addition defined in Sec. 10.13. If these distributive
laws are combined with (55), we obtain the associative law

(58) $(\omega \wedge \lambda) \wedge \sigma = \omega \wedge (\lambda \wedge \sigma)$

for arbitrary forms $\omega$, $\lambda$, $\sigma$ in $E$.

In this discussion it was tacitly assumed that $p \ge 1$ and $q \ge 1$. The product
of a 0-form $f$ with the p-form $\omega$ given by (56) is simply defined to be the p-form

$f\omega = \omega f = \sum_I f(\mathbf{x}) b_I(\mathbf{x}) \, dx_I.$

It is customary to write $f\omega$, rather than $f \wedge \omega$, when $f$ is a 0-form.


10.18 Differentiation We shall now define a differentiation operator $d$ which
associates a $(k + 1)$-form $d\omega$ to each k-form $\omega$ of class $\mathscr{C}'$ in some open set
$E \subset R^n$.

A 0-form of class $\mathscr{C}'$ in $E$ is just a real function $f \in \mathscr{C}'(E)$, and we define

(59) $df = \sum_{i=1}^{n} (D_i f)(\mathbf{x}) \, dx_i.$

If $\omega = \sum b_I(\mathbf{x}) \, dx_I$ is the standard presentation of a k-form $\omega$, and $b_I \in \mathscr{C}'(E)$
for each increasing k-index $I$, then we define

(60) $d\omega = \sum_I (db_I) \wedge dx_I.$

10.19 Example Suppose $E$ is open in $R^n$, $f \in \mathscr{C}'(E)$, and $\gamma$ is a continuously
differentiable curve in $E$, with domain [0, 1]. By (59) and (35),

(61) $\int_\gamma df = \int_0^1 \sum_{i=1}^{n} (D_i f)(\gamma(t)) \gamma_i'(t) \, dt.$

By the chain rule, the last integrand is $(f \circ \gamma)'(t)$. Hence

(62) $\int_\gamma df = f(\gamma(1)) - f(\gamma(0)),$

and we see that $\int_\gamma df$ is the same for all $\gamma$ with the same initial point and the same
end point, as in (a) of Example 10.12.

Comparison with Example 10.12(b) shows therefore that the 1-form $x \, dy$
is not the derivative of any 0-form $f$. This could also be deduced from part (b)
of the following theorem, since

$d(x \, dy) = dx \wedge dy \ne 0.$
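
Formula (62) can be tested numerically on a sample case. In the sketch below everything concrete is an assumption chosen for illustration: the function $f(x, y) = x^2 y + \sin y$, the curve `gamma`, and the midpoint rule used for the integral in (61).

```python
import math

def f(x, y):
    return x * x * y + math.sin(y)

def grad_f(x, y):
    """The partial derivatives (D_1 f, D_2 f), computed by hand."""
    return (2 * x * y, x * x + math.cos(y))

def gamma(t):
    return (math.cos(3 * t) + t, t * t)

def dgamma(t):
    return (-3 * math.sin(3 * t) + 1, 2 * t)

def integral_of_df(n=20000):
    """Midpoint-rule approximation of the right side of (61) on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        gx, gy = grad_f(*gamma(t))
        dx, dy = dgamma(t)
        total += (gx * dx + gy * dy) * h
    return total

lhs = integral_of_df()
rhs = f(*gamma(1.0)) - f(*gamma(0.0))
```

Here `lhs` and `rhs` should agree to within the discretization error, and replacing `gamma` by any other smooth curve with the same end points should leave `rhs` (and hence the integral) unchanged.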


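10.20 Theorem

(a) If $\omega$ and $\lambda$ are k- and m-forms, respectively, of class $\mathscr{C}'$ in $E$, then

(63) $d(\omega \wedge \lambda) = (d\omega) \wedge \lambda + (-1)^k \, \omega \wedge d\lambda.$

(b) If $\omega$ is of class $\mathscr{C}''$ in $E$, then $d^2\omega = 0$.

Here $d^2\omega$ means, of course, $d(d\omega)$.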

Proof Because of (57) and (60), (a) follows if (63) is proved for the
special case

(64) $\omega = f \, dx_I, \qquad \lambda = g \, dx_J$

where $f, g \in \mathscr{C}'(E)$, $dx_I$ is a basic k-form, and $dx_J$ is a basic m-form. [If
$k$ or $m$ or both are 0, simply omit $dx_I$ or $dx_J$ in (64); the proof that follows
is unaffected by this.] Then

$\omega \wedge \lambda = fg \, dx_I \wedge dx_J.$

Let us assume that $I$ and $J$ have no element in common. [In the other
case each of the three terms in (63) is 0.] Then, using (53),

$d(\omega \wedge \lambda) = d(fg \, dx_I \wedge dx_J) = (-1)^\alpha \, d(fg \, dx_{[I,J]}).$

By (59), $d(fg) = f \, dg + g \, df$. Hence (60) gives

$d(\omega \wedge \lambda) = (-1)^\alpha (f \, dg + g \, df) \wedge dx_{[I,J]}$
$= (g \, df + f \, dg) \wedge dx_I \wedge dx_J.$

Since $dg$ is a 1-form and $dx_I$ is a k-form, we have

$dg \wedge dx_I = (-1)^k \, dx_I \wedge dg,$

by (42). Hence

$d(\omega \wedge \lambda) = (df \wedge dx_I) \wedge (g \, dx_J) + (-1)^k (f \, dx_I) \wedge (dg \wedge dx_J)$
$= (d\omega) \wedge \lambda + (-1)^k \, \omega \wedge d\lambda,$

which proves (a).

Note that the associative law (58) was used freely.

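Let us prove (b) first for a 0-form $f \in \mathscr{C}''$:

$d^2 f = d\left(\sum_{j=1}^{n} (D_j f)(\mathbf{x}) \, dx_j\right)$
$= \sum_{j=1}^{n} d(D_j f) \wedge dx_j$
$= \sum_{i,j=1}^{n} (D_{ij} f)(\mathbf{x}) \, dx_i \wedge dx_j.$

Since $D_{ij} f = D_{ji} f$ (Theorem 9.41) and $dx_i \wedge dx_j = -dx_j \wedge dx_i$, we see
that $d^2 f = 0$.

If $\omega = f \, dx_I$, as in (64), then $d\omega = (df) \wedge dx_I$. By (60), $d(dx_I) = 0$.
Hence (63) shows that

$d^2\omega = (d^2 f) \wedge dx_I = 0.$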
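
As a concrete instance of part (b), one may take the 0-form $f(x, y) = x^2 y$ in $R^2$. This particular $f$ is chosen here for illustration only and does not appear in the text. Applying (59) and then (60):

```latex
\begin{aligned}
df &= 2xy \, dx + x^2 \, dy, \\
d(df) &= d(2xy) \wedge dx + d(x^2) \wedge dy \\
      &= (2y \, dx + 2x \, dy) \wedge dx + 2x \, dx \wedge dy \\
      &= 2x \, dy \wedge dx + 2x \, dx \wedge dy = 0.
\end{aligned}
```

The term $2y \, dx \wedge dx$ vanishes by (43), and the two remaining terms cancel by (42), exactly as the equality of mixed partials $D_{12} f = D_{21} f = 2x$ predicts.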

10.21 Change of variables Suppose $E$ is an open set in $R^n$, $T$ is a $\mathscr{C}'$-mapping
of $E$ into an open set $V \subset R^m$, and $\omega$ is a k-form in $V$, whose standard
presentation is

(65) $\omega = \sum_I b_I(\mathbf{y}) \, dy_I.$

(We use $\mathbf{y}$ for points of $V$, $\mathbf{x}$ for points of $E$.)

Let $t_1, \ldots, t_m$ be the components of $T$: If

$\mathbf{y} = (y_1, \ldots, y_m) = T(\mathbf{x})$

then $y_i = t_i(\mathbf{x})$. As in (59),

(66) $dt_i = \sum_{j=1}^{n} (D_j t_i)(\mathbf{x}) \, dx_j \qquad (1 \le i \le m).$

Thus each $dt_i$ is a 1-form in $E$.

The mapping $T$ transforms $\omega$ into a k-form $\omega_T$ in $E$, whose definition is

(67) $\omega_T = \sum_I b_I(T(\mathbf{x})) \, dt_{i_1} \wedge \cdots \wedge dt_{i_k}.$

In each summand of (67), $I = \{i_1, \ldots, i_k\}$ is an increasing k-index.

Our next theorem shows that addition, multiplication, and differentiation
of forms are defined in such a way that they commute with changes of variables.

10.22 Theorem With $E$ and $T$ as in Sec. 10.21, let $\omega$ and $\lambda$ be k- and m-forms
in $V$, respectively. Then

(a) $(\omega + \lambda)_T = \omega_T + \lambda_T$ if $k = m$;

(b) $(\omega \wedge \lambda)_T = \omega_T \wedge \lambda_T$;

(c) $d(\omega_T) = (d\omega)_T$ if $\omega$ is of class $\mathscr{C}'$ and $T$ is of class $\mathscr{C}''$.

Proof Part (a) follows immediately from the definitions. Part (b) is
almost as obvious, once we realize that

(68) $(dy_{i_1} \wedge \cdots \wedge dy_{i_r})_T = dt_{i_1} \wedge \cdots \wedge dt_{i_r}$

regardless of whether $\{i_1, \ldots, i_r\}$ is increasing or not; (68) holds because
the same number of minus signs are needed on each side of (68) to produce
increasing rearrangements.

We turn to the proof of (c). If $f$ is a 0-form of class $\mathscr{C}'$ in $V$, then

$f_T(\mathbf{x}) = f(T(\mathbf{x})), \qquad df = \sum_i (D_i f)(\mathbf{y}) \, dy_i.$

By the chain rule, it follows that

(69) $d(f_T) = \sum_j (D_j f_T)(\mathbf{x}) \, dx_j$
$= \sum_j \sum_i (D_i f)(T(\mathbf{x})) (D_j t_i)(\mathbf{x}) \, dx_j$
$= \sum_i (D_i f)(T(\mathbf{x})) \, dt_i$
$= (df)_T.$

If $dy_I = dy_{i_1} \wedge \cdots \wedge dy_{i_k}$, then $(dy_I)_T = dt_{i_1} \wedge \cdots \wedge dt_{i_k}$, and Theorem
10.20 shows that

(70) $d((dy_I)_T) = 0.$

(This is where the assumption $T \in \mathscr{C}''$ is used.)

Assume now that $\omega = f \, dy_I$. Then

$\omega_T = f_T(\mathbf{x}) \, (dy_I)_T$

and the preceding calculations lead to

$d(\omega_T) = d(f_T) \wedge (dy_I)_T = (df)_T \wedge (dy_I)_T$
$= ((df) \wedge dy_I)_T = (d\omega)_T.$

The first equality holds by (63) and (70), the second by (69), the third by
part (b), and the last by the definition of $d\omega$.

The general case of (c) follows from the special case just proved, if
we apply (a). This completes the proof.

Our next objective is Theorem 10.25. This will follow directly from two
other important transformation properties of differential forms, which we state
first.

10.23 Theorem Suppose $T$ is a $\mathscr{C}'$-mapping of an open set $E \subset R^n$ into an open
set $V \subset R^m$, $S$ is a $\mathscr{C}'$-mapping of $V$ into an open set $W \subset R^p$, and $\omega$ is a k-form
in $W$, so that $\omega_S$ is a k-form in $V$ and both $(\omega_S)_T$ and $\omega_{ST}$ are k-forms in $E$, where
$ST$ is defined by $(ST)(\mathbf{x}) = S(T(\mathbf{x}))$. Then

(71) $(\omega_S)_T = \omega_{ST}.$

Proof If $\omega$ and $\lambda$ are forms in $W$, Theorem 10.22 shows that

$((\omega \wedge \lambda)_S)_T = (\omega_S \wedge \lambda_S)_T = (\omega_S)_T \wedge (\lambda_S)_T$

and

$(\omega \wedge \lambda)_{ST} = \omega_{ST} \wedge \lambda_{ST}.$

Thus if (71) holds for $\omega$ and for $\lambda$, it follows that (71) also holds for $\omega \wedge \lambda$.
Since every form can be built up from 0-forms and 1-forms by addition
and multiplication, and since (71) is trivial for 0-forms, it is enough to
prove (71) in the case $\omega = dz_q$, $q = 1, \ldots, p$. (We denote the points of
$E$, $V$, $W$ by $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$, respectively.)

Let $t_1, \ldots, t_m$ be the components of $T$, let $s_1, \ldots, s_p$ be the components
of $S$, and let $r_1, \ldots, r_p$ be the components of $ST$. If $\omega = dz_q$, then

$\omega_S = ds_q = \sum_j (D_j s_q)(\mathbf{y}) \, dy_j,$

so that the chain rule implies

$(\omega_S)_T = \sum_j (D_j s_q)(T(\mathbf{x})) \, dt_j$
$= \sum_j (D_j s_q)(T(\mathbf{x})) \sum_i (D_i t_j)(\mathbf{x}) \, dx_i$
$= \sum_i (D_i r_q)(\mathbf{x}) \, dx_i = dr_q = \omega_{ST}.$

i 

10.24 Theorem Suppose $\omega$ is a k-form in an open set $E \subset R^n$, $\Phi$ is a k-surface
in $E$, with parameter domain $D \subset R^k$, and $\Delta$ is the k-surface in $R^k$, with parameter
domain $D$, defined by $\Delta(\mathbf{u}) = \mathbf{u}$ $(\mathbf{u} \in D)$. Then

$\int_\Phi \omega = \int_\Delta \omega_\Phi.$

Proof We need only consider the case

$\omega = a(\mathbf{x}) \, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$

If $\phi_1, \ldots, \phi_n$ are the components of $\Phi$, then

$\omega_\Phi = a(\Phi(\mathbf{u})) \, d\phi_{i_1} \wedge \cdots \wedge d\phi_{i_k}.$

The theorem will follow if we can show that

(72) $d\phi_{i_1} \wedge \cdots \wedge d\phi_{i_k} = J(\mathbf{u}) \, du_1 \wedge \cdots \wedge du_k,$

where

$J(\mathbf{u}) = \frac{\partial(x_{i_1}, \ldots, x_{i_k})}{\partial(u_1, \ldots, u_k)},$

since (72) implies

$\int_\Phi \omega = \int_D a(\Phi(\mathbf{u})) J(\mathbf{u}) \, d\mathbf{u}$
$= \int_\Delta a(\Phi(\mathbf{u})) J(\mathbf{u}) \, du_1 \wedge \cdots \wedge du_k = \int_\Delta \omega_\Phi.$

Let $[A]$ be the $k$ by $k$ matrix with entries

$\alpha(p, q) = (D_q \phi_{i_p})(\mathbf{u}) \qquad (p, q = 1, \ldots, k).$

Then

$d\phi_{i_p} = \sum_q \alpha(p, q) \, du_q$

so that

$d\phi_{i_1} \wedge \cdots \wedge d\phi_{i_k} = \sum \alpha(1, q_1) \cdots \alpha(k, q_k) \, du_{q_1} \wedge \cdots \wedge du_{q_k}.$

In this last sum, $q_1, \ldots, q_k$ range independently over $1, \ldots, k$. The
anticommutative relation (42) implies that

$du_{q_1} \wedge \cdots \wedge du_{q_k} = s(q_1, \ldots, q_k) \, du_1 \wedge \cdots \wedge du_k,$

where $s$ is as in Definition 9.33; applying this definition, we see that

$d\phi_{i_1} \wedge \cdots \wedge d\phi_{i_k} = \det [A] \, du_1 \wedge \cdots \wedge du_k;$

and since $J(\mathbf{u}) = \det [A]$, (72) is proved.

The final result of this section combines the two preceding theorems. 


10.25 Theorem Suppose $T$ is a $\mathscr{C}'$-mapping of an open set $E \subset R^n$ into an open
set $V \subset R^m$, $\Phi$ is a k-surface in $E$, and $\omega$ is a k-form in $V$.

Then

$\int_{T\Phi} \omega = \int_\Phi \omega_T.$

Proof Let $D$ be the parameter domain of $\Phi$ (hence also of $T\Phi$) and
define $\Delta$ as in Theorem 10.24.

Then

$\int_{T\Phi} \omega = \int_\Delta \omega_{T\Phi} = \int_\Delta (\omega_T)_\Phi = \int_\Phi \omega_T.$

The first of these equalities is Theorem 10.24, applied to $T\Phi$ in place of $\Phi$.
The second follows from Theorem 10.23. The third is Theorem 10.24,
with $\omega_T$ in place of $\omega$.


SIMPLEXES AND CHAINS 

10.26 Affine simplexes A mapping $\mathbf{f}$ that carries a vector space $X$ into a
vector space $Y$ is said to be affine if $\mathbf{f} - \mathbf{f}(\mathbf{0})$ is linear. In other words, the
requirement is that

(73) $\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{0}) + A\mathbf{x}$

for some $A \in L(X, Y)$.

An affine mapping of $R^k$ into $R^n$ is thus determined if we know $\mathbf{f}(\mathbf{0})$ and
$\mathbf{f}(\mathbf{e}_i)$ for $1 \le i \le k$; as usual, $\{\mathbf{e}_1, \ldots, \mathbf{e}_k\}$ is the standard basis of $R^k$.

We define the standard simplex $Q^k$ to be the set of all $\mathbf{u} \in R^k$ of the form

(74) $\mathbf{u} = \sum_{i=1}^{k} \alpha_i \mathbf{e}_i$

such that $\alpha_i \ge 0$ for $i = 1, \ldots, k$ and $\sum \alpha_i \le 1$.

Assume now that $\mathbf{p}_0, \mathbf{p}_1, \ldots, \mathbf{p}_k$ are points of $R^n$. The oriented affine
k-simplex

(75) $\sigma = [\mathbf{p}_0, \mathbf{p}_1, \ldots, \mathbf{p}_k]$

is defined to be the k-surface in $R^n$ with parameter domain $Q^k$ which is given
by the affine mapping

(76) $\sigma(\alpha_1 \mathbf{e}_1 + \cdots + \alpha_k \mathbf{e}_k) = \mathbf{p}_0 + \sum_{i=1}^{k} \alpha_i (\mathbf{p}_i - \mathbf{p}_0).$

Note that $\sigma$ is characterized by

(77) $\sigma(\mathbf{0}) = \mathbf{p}_0, \qquad \sigma(\mathbf{e}_i) = \mathbf{p}_i \quad (\text{for } 1 \le i \le k),$

and that

(78) $\sigma(\mathbf{u}) = \mathbf{p}_0 + A\mathbf{u} \qquad (\mathbf{u} \in Q^k)$

where $A \in L(R^k, R^n)$ and $A\mathbf{e}_i = \mathbf{p}_i - \mathbf{p}_0$ for $1 \le i \le k$.

We call $\sigma$ oriented to emphasize that the ordering of the vertices $\mathbf{p}_0, \ldots, \mathbf{p}_k$
is taken into account. If

(79) $\bar\sigma = [\mathbf{p}_{i_0}, \mathbf{p}_{i_1}, \ldots, \mathbf{p}_{i_k}],$

where $\{i_0, i_1, \ldots, i_k\}$ is a permutation of the ordered set $\{0, 1, \ldots, k\}$, we adopt
the notation

(80) $\bar\sigma = s(i_0, \ldots, i_k) \, \sigma,$

where $s$ is the function defined in Definition 9.33. Thus $\bar\sigma = \pm\sigma$, depending on
whether $s = 1$ or $s = -1$. Strictly speaking, having adopted (75) and (76) as
the definition of $\sigma$, we should not write $\bar\sigma = \sigma$ unless $i_0 = 0, \ldots, i_k = k$, even
if $s(i_0, \ldots, i_k) = 1$; what we have here is an equivalence relation, not an equality.
However, for our purposes the notation is justified by Theorem 10.27.

If $\bar\sigma = \varepsilon\sigma$ (using the above convention) and if $\varepsilon = 1$, we say that $\bar\sigma$ and $\sigma$
have the same orientation; if $\varepsilon = -1$, $\bar\sigma$ and $\sigma$ are said to have opposite
orientations. Note that we have not defined what we mean by the "orientation of a
simplex." What we have defined is a relation between pairs of simplexes having
the same set of vertices, the relation being that of "having the same orientation."

There is, however, one situation where the orientation of a simplex can
be defined in a natural way. This happens when $n = k$ and when the vectors
$\mathbf{p}_i - \mathbf{p}_0$ $(1 \le i \le k)$ are independent. In that case, the linear transformation $A$
that appears in (78) is invertible, and its determinant (which is the same as the
Jacobian of $\sigma$) is not 0. Then $\sigma$ is said to be positively (or negatively) oriented if
$\det A$ is positive (or negative). In particular, the simplex $[\mathbf{0}, \mathbf{e}_1, \ldots, \mathbf{e}_k]$ in $R^k$,
given by the identity mapping, has positive orientation.
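
When $n = k$ the orientation can be read off from the sign of $\det A$, and this is simple to compute. The following sketch is an illustration, not part of the text: the function names are assumptions, vertices are given as coordinate tuples, and the determinant is computed by cofactor expansion, which is adequate only for small $k$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def orientation(vertices):
    """vertices = [p0, p1, ..., pk], each a point of R^k.
    Returns +1, -1, or 0 (degenerate): the sign of det A, where the
    i-th column of A is p_i - p_0 as in (78)."""
    p0, rest = vertices[0], vertices[1:]
    k = len(p0)
    columns = [[p[r] - p0[r] for r in range(k)] for p in rest]
    a = [[columns[c][r] for c in range(k)] for r in range(k)]
    d = det(a)
    return (d > 0) - (d < 0)

standard = orientation([(0, 0), (1, 0), (0, 1)])   # [0, e1, e2]
swapped = orientation([(0, 0), (0, 1), (1, 0)])    # [0, e2, e1]
```

Here `standard` is $+1$ and `swapped` is $-1$, consistent with (80): interchanging two vertices reverses the orientation.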

So far we have assumed that $k \ge 1$. An oriented 0-simplex is defined to
be a point with a sign attached. We write $\sigma = +\mathbf{p}_0$ or $\sigma = -\mathbf{p}_0$. If $\sigma = \varepsilon\mathbf{p}_0$
$(\varepsilon = \pm 1)$ and if $f$ is a 0-form (i.e., a real function), we define

$\int_\sigma f = \varepsilon f(\mathbf{p}_0).$


10.27 Theorem If $\sigma$ is an oriented rectilinear k-simplex in an open set $E \subset R^n$
and if $\bar\sigma = \varepsilon\sigma$ then

(81) $\int_{\bar\sigma} \omega = \varepsilon \int_\sigma \omega$

for every k-form $\omega$ in $E$.

Proof For $k = 0$, (81) follows from the preceding definition. So we
assume $k \ge 1$ and assume that $\sigma$ is given by (75).

Suppose $1 \le j \le k$, and suppose $\bar\sigma$ is obtained from $\sigma$ by
interchanging $\mathbf{p}_0$ and $\mathbf{p}_j$. Then $\varepsilon = -1$, and

$\bar\sigma(\mathbf{u}) = \mathbf{p}_j + B\mathbf{u} \qquad (\mathbf{u} \in Q^k),$

where $B$ is the linear mapping of $R^k$ into $R^n$ defined by $B\mathbf{e}_j = \mathbf{p}_0 - \mathbf{p}_j$,
$B\mathbf{e}_i = \mathbf{p}_i - \mathbf{p}_j$ if $i \ne j$. If we write $A\mathbf{e}_i = \mathbf{x}_i$ $(1 \le i \le k)$, where $A$ is given
by (78), the column vectors of $B$ (that is, the vectors $B\mathbf{e}_i$) are

$\mathbf{x}_1 - \mathbf{x}_j, \ldots, \mathbf{x}_{j-1} - \mathbf{x}_j, \; -\mathbf{x}_j, \; \mathbf{x}_{j+1} - \mathbf{x}_j, \ldots, \mathbf{x}_k - \mathbf{x}_j.$

If we subtract the $j$th column from each of the others, none of the
determinants in (35) are affected, and we obtain columns $\mathbf{x}_1, \ldots, \mathbf{x}_{j-1}, -\mathbf{x}_j,
\mathbf{x}_{j+1}, \ldots, \mathbf{x}_k$. These differ from those of $A$ only in the sign of the $j$th
column. Hence (81) holds for this case.

Suppose next that $0 < i < j \le k$ and that $\bar\sigma$ is obtained from $\sigma$ by
interchanging $\mathbf{p}_i$ and $\mathbf{p}_j$. Then $\bar\sigma(\mathbf{u}) = \mathbf{p}_0 + C\mathbf{u}$, where $C$ has the same
columns as $A$, except that the $i$th and $j$th columns have been interchanged.
This again implies that (81) holds, since $\varepsilon = -1$.

The general case follows, since every permutation of $\{0, 1, \ldots, k\}$ is
a composition of the special cases we have just dealt with.

10.28 Affine chains An affine k-chain $\Gamma$ in an open set $E \subset R^n$ is a collection
of finitely many oriented affine k-simplexes $\sigma_1, \ldots, \sigma_r$ in $E$. These need not be
distinct; a simplex may thus occur in $\Gamma$ with a certain multiplicity.

If $\Gamma$ is as above, and if $\omega$ is a k-form in $E$, we define

(82) $\int_\Gamma \omega = \sum_{i=1}^{r} \int_{\sigma_i} \omega.$

We may view a k-surface $\Phi$ in $E$ as a function whose domain is the
collection of all k-forms in $E$ and which assigns the number $\int_\Phi \omega$ to $\omega$. Since
real-valued functions can be added (as in Definition 4.3), this suggests the use of the
notation

(83) $\Gamma = \sigma_1 + \cdots + \sigma_r$

or, more compactly,

(84) $\Gamma = \sum_{i=1}^{r} \sigma_i$

to state the fact that (82) holds for every k-form $\omega$ in $E$.

To avoid misunderstanding, we point out explicitly that the notations
introduced by (83) and (80) have to be handled with care. The point is that
every oriented affine k-simplex $\sigma$ in $R^n$ is a function in two ways, with different
domains and different ranges, and that therefore two entirely different operations
of addition are possible. Originally, $\sigma$ was defined as an $R^n$-valued function
with domain $Q^k$; accordingly, $\sigma_1 + \sigma_2$ could be interpreted to be the function
$\sigma$ that assigns the vector $\sigma_1(\mathbf{u}) + \sigma_2(\mathbf{u})$ to every $\mathbf{u} \in Q^k$; note that $\sigma$ is then again
an oriented affine k-simplex in $R^n$! This is not what is meant by (83).

For example, if $\sigma_2 = -\sigma_1$ as in (80) (that is to say, if $\sigma_1$ and $\sigma_2$ have the
same set of vertices but are oppositely oriented) and if $\Gamma = \sigma_1 + \sigma_2$, then
$\int_\Gamma \omega = 0$ for all $\omega$, and we may express this by writing $\Gamma = 0$ or $\sigma_1 + \sigma_2 = 0$.
This does not mean that $\sigma_1(\mathbf{u}) + \sigma_2(\mathbf{u})$ is the null vector of $R^n$.

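10.29 Boundaries For $k \ge 1$, the boundary of the oriented affine k-simplex

$\sigma = [\mathbf{p}_0, \mathbf{p}_1, \ldots, \mathbf{p}_k]$

is defined to be the affine $(k - 1)$-chain

(85) $\partial\sigma = \sum_{j=0}^{k} (-1)^j [\mathbf{p}_0, \ldots, \mathbf{p}_{j-1}, \mathbf{p}_{j+1}, \ldots, \mathbf{p}_k].$

For example, if $\sigma = [\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2]$, then

$\partial\sigma = [\mathbf{p}_1, \mathbf{p}_2] - [\mathbf{p}_0, \mathbf{p}_2] + [\mathbf{p}_0, \mathbf{p}_1] = [\mathbf{p}_0, \mathbf{p}_1] + [\mathbf{p}_1, \mathbf{p}_2] + [\mathbf{p}_2, \mathbf{p}_0],$

which coincides with the usual notion of the oriented boundary of a triangle.

For $1 \le j \le k$, observe that the simplex $\sigma_j = [\mathbf{p}_0, \ldots, \mathbf{p}_{j-1}, \mathbf{p}_{j+1}, \ldots, \mathbf{p}_k]$
which occurs in (85) has $Q^{k-1}$ as its parameter domain and that it is defined by

(86) $\sigma_j(\mathbf{u}) = \mathbf{p}_0 + B\mathbf{u} \qquad (\mathbf{u} \in Q^{k-1}),$

where $B$ is the linear mapping from $R^{k-1}$ to $R^n$ determined by

$B\mathbf{e}_i = \mathbf{p}_i - \mathbf{p}_0 \qquad (\text{if } 1 \le i \le j - 1),$
$B\mathbf{e}_i = \mathbf{p}_{i+1} - \mathbf{p}_0 \qquad (\text{if } j \le i \le k - 1).$

The simplex

$\sigma_0 = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k],$

which also occurs in (85), is given by the mapping

$\sigma_0(\mathbf{u}) = \mathbf{p}_1 + B\mathbf{u},$

where $B\mathbf{e}_i = \mathbf{p}_{i+1} - \mathbf{p}_1$ for $1 \le i \le k - 1$.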
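
Formula (85) is purely combinatorial, so it can be sketched with vertex labels in place of points. The code below is an illustration under stated assumptions: simplexes are tuples of hashable, sortable labels, a chain is a list of (sign, simplex) pairs, and `simplify` collects faces that share a vertex set using convention (80). None of these names come from the text.

```python
def boundary(simplex):
    """Chain (85) for simplex = (p0, ..., pk), k >= 1, as (sign, face) pairs."""
    return [
        ((-1) ** j, simplex[:j] + simplex[j + 1:])
        for j in range(len(simplex))
    ]

def perm_sign(seq, reference):
    """Sign of the permutation that rearranges `reference` into `seq`."""
    pos = [reference.index(v) for v in seq]
    inversions = sum(
        1
        for a in range(len(pos))
        for b in range(a + 1, len(pos))
        if pos[a] > pos[b]
    )
    return -1 if inversions % 2 else 1

def simplify(chain):
    """Combine faces with the same vertex set; by (80) a reordered face
    contributes with the sign of the reordering. Zero totals are dropped."""
    totals = {}
    for sign, face in chain:
        key = tuple(sorted(face))
        totals[key] = totals.get(key, 0) + sign * perm_sign(face, key)
    return {key: value for key, value in totals.items() if value != 0}

triangle = boundary(("p0", "p1", "p2"))
```

For the triangle, `triangle` lists $[\mathbf{p}_1, \mathbf{p}_2]$ with sign $+1$, $[\mathbf{p}_0, \mathbf{p}_2]$ with sign $-1$, and $[\mathbf{p}_0, \mathbf{p}_1]$ with sign $+1$, matching the example above. Applying `boundary` twice to a simplex and simplifying gives the empty chain in this formal model.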

10.30 Differentiable simplexes and chains Let $T$ be a $\mathscr{C}''$-mapping of an open
set $E \subset R^n$ into an open set $V \subset R^m$; $T$ need not be one-to-one. If $\sigma$ is an oriented
affine k-simplex in $E$, then the composite mapping $\Phi = T \circ \sigma$ (which we shall
sometimes write in the simpler form $T\sigma$) is a k-surface in $V$, with parameter
domain $Q^k$. We call $\Phi$ an oriented k-simplex of class $\mathscr{C}''$.

A finite collection $\Psi$ of oriented k-simplexes $\Phi_1, \ldots, \Phi_r$ of class $\mathscr{C}''$ in $V$
is called a k-chain of class $\mathscr{C}''$ in $V$. If $\omega$ is a k-form in $V$, we define

(87) $\int_\Psi \omega = \sum_{i=1}^{r} \int_{\Phi_i} \omega$

and use the corresponding notation $\Psi = \sum \Phi_i$.

If $\Gamma = \sum \sigma_i$ is an affine chain and if $\Phi_i = T \circ \sigma_i$, we also write $\Psi = T \circ \Gamma$,
or

(88) $T\left(\sum \sigma_i\right) = \sum T\sigma_i.$

The boundary $\partial\Phi$ of the oriented k-simplex $\Phi = T \circ \sigma$ is defined to be the
$(k - 1)$-chain

(89) $\partial\Phi = T(\partial\sigma).$

In justification of (89), observe that if $T$ is affine, then $\Phi = T \circ \sigma$ is an
oriented affine k-simplex, in which case (89) is not a matter of definition, but is
seen to be a consequence of (85). Thus (89) generalizes this special case.

It is immediate that $\partial\Phi$ is of class $\mathscr{C}''$ if this is true of $\Phi$.

Finally, we define the boundary $\partial\Psi$ of the k-chain $\Psi = \sum \Phi_i$ to be the
$(k - 1)$-chain

(90) $\partial\Psi = \sum \partial\Phi_i.$

10.31 Positively oriented boundaries So far we have associated boundaries to 
chains, not to subsets of R n . This notion of boundary is exactly the one that is 
most suitable for the statement and proof of Stokes’ theorem. However, in 
applications, especially in $R^2$ or $R^3$, it is customary and convenient to talk
about “oriented boundaries’’ of certain sets as well. We shall now describe 
this briefly. 

Let $Q^n$ be the standard simplex in $R^n$, let $\sigma_0$ be the identity mapping with
domain $Q^n$. As we saw in Sec. 10.26, $\sigma_0$ may be regarded as a positively oriented
n-simplex in $R^n$. Its boundary $\partial\sigma_0$ is an affine $(n - 1)$-chain. This chain is
called the positively oriented boundary of the set $Q^n$.

For example, the positively oriented boundary of $Q^3$ is

$[\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3] - [\mathbf{0}, \mathbf{e}_2, \mathbf{e}_3] + [\mathbf{0}, \mathbf{e}_1, \mathbf{e}_3] - [\mathbf{0}, \mathbf{e}_1, \mathbf{e}_2].$

Now let $T$ be a 1-1 mapping of $Q^n$ into $R^n$, of class $\mathscr{C}''$, whose Jacobian is
positive (at least in the interior of $Q^n$). Let $E = T(Q^n)$. By the inverse function
theorem, $E$ is the closure of an open subset of $R^n$. We define the positively
oriented boundary of the set $E$ to be the $(n - 1)$-chain

$\partial T = T(\partial\sigma_0),$

and we may denote this $(n - 1)$-chain by $\partial E$.

An obvious question occurs here: If $E = T_1(Q^n) = T_2(Q^n)$, and if both
$T_1$ and $T_2$ have positive Jacobians, is it true that $\partial T_1 = \partial T_2$? That is to say,
does the equality

$\int_{\partial T_1} \omega = \int_{\partial T_2} \omega$

hold for every $(n - 1)$-form $\omega$? The answer is yes, but we shall omit the proof.
(To see an example, compare the end of this section with Exercise 17.)

One can go further. Let

$\Omega = E_1 \cup \cdots \cup E_r,$

where $E_i = T_i(Q^n)$, each $T_i$ has the properties that $T$ had above, and the interiors
of the sets $E_i$ are pairwise disjoint. Then the $(n - 1)$-chain

$\partial T_1 + \cdots + \partial T_r = \partial\Omega$

is called the positively oriented boundary of $\Omega$.

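For example, the unit square $I^2$ in $R^2$ is the union of $\sigma_1(Q^2)$ and $\sigma_2(Q^2)$,
where

$\sigma_1(\mathbf{u}) = \mathbf{u}, \qquad \sigma_2(\mathbf{u}) = \mathbf{e}_1 + \mathbf{e}_2 - \mathbf{u}.$

Both $\sigma_1$ and $\sigma_2$ have Jacobian $1 > 0$. Since

$\sigma_1 = [\mathbf{0}, \mathbf{e}_1, \mathbf{e}_2], \qquad \sigma_2 = [\mathbf{e}_1 + \mathbf{e}_2, \mathbf{e}_2, \mathbf{e}_1],$

we have

$\partial\sigma_1 = [\mathbf{e}_1, \mathbf{e}_2] - [\mathbf{0}, \mathbf{e}_2] + [\mathbf{0}, \mathbf{e}_1],$
$\partial\sigma_2 = [\mathbf{e}_2, \mathbf{e}_1] - [\mathbf{e}_1 + \mathbf{e}_2, \mathbf{e}_1] + [\mathbf{e}_1 + \mathbf{e}_2, \mathbf{e}_2].$

The sum of these two boundaries is

$\partial I^2 = [\mathbf{0}, \mathbf{e}_1] + [\mathbf{e}_1, \mathbf{e}_1 + \mathbf{e}_2] + [\mathbf{e}_1 + \mathbf{e}_2, \mathbf{e}_2] + [\mathbf{e}_2, \mathbf{0}],$

the positively oriented boundary of $I^2$. Note that $[\mathbf{e}_1, \mathbf{e}_2]$ canceled $[\mathbf{e}_2, \mathbf{e}_1]$.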

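If $\Phi$ is a 2-surface in $R^m$, with parameter domain $I^2$, then $\Phi$ (regarded as
a function on 2-forms) is the same as the 2-chain

$\Phi \circ \sigma_1 + \Phi \circ \sigma_2.$

Thus

$\partial\Phi = \partial(\Phi \circ \sigma_1) + \partial(\Phi \circ \sigma_2)$
$= \Phi(\partial\sigma_1) + \Phi(\partial\sigma_2) = \Phi(\partial I^2).$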

In other words, if the parameter domain of $\Phi$ is the square $I^2$, we need
not refer back to the simplex $Q^2$, but can obtain $\partial\Phi$ directly from $\partial I^2$.

Other examples may be found in Exercises 17 to 19.

10.32 Example For $0 \le u \le \pi$, $0 \le v \le 2\pi$, define

$\Sigma(u, v) = (\sin u \cos v, \; \sin u \sin v, \; \cos u).$

Then $\Sigma$ is a 2-surface in $R^3$, whose parameter domain is a rectangle $D \subset R^2$,
and whose range is the unit sphere in $R^3$. Its boundary is

$\partial\Sigma = \Sigma(\partial D) = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4$

where

$\gamma_1(u) = \Sigma(u, 0) = (\sin u, 0, \cos u),$
$\gamma_2(v) = \Sigma(\pi, v) = (0, 0, -1),$
$\gamma_3(u) = \Sigma(\pi - u, 2\pi) = (\sin u, 0, -\cos u),$
$\gamma_4(v) = \Sigma(0, 2\pi - v) = (0, 0, 1),$

with $[0, \pi]$ and $[0, 2\pi]$ as parameter intervals for $u$ and $v$, respectively.

Since $\gamma_2$ and $\gamma_4$ are constant, their derivatives are 0, hence the integral of
any 1-form over $\gamma_2$ or $\gamma_4$ is 0. [See Example 10.12(a).]

Since $\gamma_3(u) = \gamma_1(\pi - u)$, direct application of (35) shows that

$\int_{\gamma_3} \omega = -\int_{\gamma_1} \omega$

for every 1-form $\omega$. Thus $\int_{\partial\Sigma} \omega = 0$, and we conclude that $\partial\Sigma = 0$.

(In geographic terminology, $\partial\Sigma$ starts at the north pole $N$, runs to the
south pole $S$ along a meridian, pauses at $S$, returns to $N$ along the same meridian,
and finally pauses at $N$. The two passages along the meridian are in opposite
directions. The corresponding two line integrals therefore cancel each other.
In Exercise 32 there is also one curve which occurs twice in the boundary, but
without cancellation.)


STOKES’ THEOREM 

10.33 Theorem If $\Psi$ is a k-chain of class $\mathscr{C}''$ in an open set $V \subset R^m$ and if $\omega$
is a $(k - 1)$-form of class $\mathscr{C}'$ in $V$, then

(91) $\int_\Psi d\omega = \int_{\partial\Psi} \omega.$

The case $k = m = 1$ is nothing but the fundamental theorem of calculus
(with an additional differentiability assumption). The case $k = m = 2$ is Green's
theorem, and $k = m = 3$ gives the so-called "divergence theorem" of Gauss.
The case $k = 2$, $m = 3$ is the one originally discovered by Stokes. (Spivak's
book describes some of the historical background.) These special cases will be
discussed further at the end of the present chapter.
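
The case $k = m = 2$ can be checked numerically on the unit square $I^2$ of Sec. 10.31. The sketch below is only an illustration: the 1-form $\omega = -y^2 \, dx + x^3 \, dy$, the helper names, and the midpoint-rule quadrature are all assumptions. For this $\omega$, (60) gives $d\omega = (3x^2 + 2y) \, dx \wedge dy$, and the four edges are those of $\partial I^2$, each parametrized as an affine 1-simplex on $Q^1 = [0, 1]$.

```python
def integrate_over_edge(P, Q, p, q, n=2000):
    """Integral of P dx + Q dy over the affine 1-simplex [p, q],
    parametrized by sigma(t) = p + t (q - p) for t in [0, 1]."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = p[0] + t * dx, p[1] + t * dy
        total += (P(x, y) * dx + Q(x, y) * dy) * h
    return total

def integrate_over_square(g, n=400):
    """Midpoint-rule integral of g over the unit square."""
    h = 1.0 / n
    return sum(
        g((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n)
    ) * h * h

P = lambda x, y: -y * y                   # coefficient of dx in omega
Q = lambda x, y: x ** 3                   # coefficient of dy in omega
d_omega = lambda x, y: 3 * x * x + 2 * y  # coefficient of dx ^ dy in d(omega)

# Edges of the positively oriented boundary of the unit square.
edges = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1)), ((0, 1), (0, 0))]
boundary_integral = sum(integrate_over_edge(P, Q, p, q) for p, q in edges)
interior_integral = integrate_over_square(d_omega)
```

Both `boundary_integral` and `interior_integral` should be close to 2, the exact value of each side of (91) for this choice of $\omega$.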

Proof It is enough to prove that 


$$\int_\Phi d\omega = \int_{\partial\Phi} \omega \tag{92}$$

for every oriented k-simplex $\Phi$ of class $\mathscr{C}''$ in $V$. For if (92) is
proved and if $\Psi = \sum \Phi_i$, then (87) and (89) imply (91).

Fix such a $\Phi$ and put

$$\sigma = [0, \mathbf{e}_1, \dots, \mathbf{e}_k]. \tag{93}$$

Thus $\sigma$ is the oriented affine k-simplex with parameter domain $Q^k$ which
is defined by the identity mapping. Since $\Phi$ is also defined on $Q^k$ (see
Definition 10.30) and $\Phi \in \mathscr{C}''$, there is an open set $E \subset R^k$
which contains $Q^k$, and there is a $\mathscr{C}''$-mapping $T$ of $E$ into $V$ such
that $\Phi = T \circ \sigma$. By Theorems 10.25 and 10.22(c), the left side of (92)
is equal to

$$\int_{T\sigma} d\omega = \int_\sigma (d\omega)_T = \int_\sigma d(\omega_T).$$


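Another application of Theorem 10.25 shows, by (89), that the right side
of (92) is

$$\int_{\partial(T\sigma)} \omega = \int_{T(\partial\sigma)} \omega
  = \int_{\partial\sigma} \omega_T.$$

Since $\omega_T$ is a $(k-1)$-form in $E$, we see that in order to prove (92)
we merely have to show that

$$\int_\sigma d\lambda = \int_{\partial\sigma} \lambda \tag{94}$$

for the special simplex (93) and for every $(k-1)$-form $\lambda$ of class
$\mathscr{C}'$ in $E$.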

If $k = 1$, the definition of an oriented 0-simplex shows that (94)
merely asserts that

$$\int_0^1 f'(u)\,du = f(1) - f(0) \tag{95}$$

for every continuously differentiable function $f$ on $[0, 1]$, which is true
by the fundamental theorem of calculus.

From now on we assume that $k > 1$, fix an integer $r$ ($1 \le r \le k$),
and choose $f \in \mathscr{C}'(E)$. It is then enough to prove (94) for the case

$$\lambda = f(\mathbf{x})\,dx_1 \wedge \cdots \wedge dx_{r-1} \wedge dx_{r+1}
  \wedge \cdots \wedge dx_k \tag{96}$$

since every $(k-1)$-form is a sum of these special ones, for $r = 1, \dots, k$.

By (85), the boundary of the simplex (93) is

$$\partial\sigma = [\mathbf{e}_1, \dots, \mathbf{e}_k] + \sum_{i=1}^{k} (-1)^i \tau_i,$$

where

$$\tau_i = [0, \mathbf{e}_1, \dots, \mathbf{e}_{i-1}, \mathbf{e}_{i+1}, \dots, \mathbf{e}_k]$$

for $i = 1, \dots, k$. Put

$$\tau_0 = [\mathbf{e}_r, \mathbf{e}_1, \dots, \mathbf{e}_{r-1}, \mathbf{e}_{r+1},
  \dots, \mathbf{e}_k].$$

Note that $\tau_0$ is obtained from $[\mathbf{e}_1, \dots, \mathbf{e}_k]$ by $r - 1$
successive interchanges of $\mathbf{e}_r$ and its left neighbors. Thus

$$\partial\sigma = (-1)^{r-1} \tau_0 + \sum_{i=1}^{k} (-1)^i \tau_i. \tag{97}$$

Each $\tau_i$ has $Q^{k-1}$ as parameter domain.

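If $\mathbf{x} = \tau_0(\mathbf{u})$ and $\mathbf{u} \in Q^{k-1}$, then

$$x_j = \begin{cases}
u_j & (1 \le j < r), \\
1 - (u_1 + \cdots + u_{k-1}) & (j = r), \\
u_{j-1} & (r < j \le k).
\end{cases} \tag{98}$$

If $1 \le i \le k$, $\mathbf{u} \in Q^{k-1}$, and $\mathbf{x} = \tau_i(\mathbf{u})$, then

$$x_j = \begin{cases}
u_j & (1 \le j < i), \\
0 & (j = i), \\
u_{j-1} & (i < j \le k).
\end{cases} \tag{99}$$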

For $0 \le i \le k$, let $J_i$ be the Jacobian of the mapping

$$(u_1, \dots, u_{k-1}) \to (x_1, \dots, x_{r-1}, x_{r+1}, \dots, x_k) \tag{100}$$

induced by $\tau_i$. When $i = 0$ and when $i = r$, (98) and (99) show that (100)
is the identity mapping. Thus $J_0 = 1$, $J_r = 1$. For other $i$, the fact that
$x_i = 0$ in (99) shows that $J_i$ has a row of zeros, hence $J_i = 0$. Thus

$$\int_{\tau_i} \lambda = 0 \qquad (i \ne 0,\ i \ne r), \tag{101}$$

by (35) and (96). Consequently, (97) gives

$$\begin{aligned}
\int_{\partial\sigma} \lambda
  &= (-1)^{r-1} \int_{\tau_0} \lambda + (-1)^r \int_{\tau_r} \lambda \\
  &= (-1)^{r-1} \int \bigl[f(\tau_0(\mathbf{u})) - f(\tau_r(\mathbf{u}))\bigr]\,d\mathbf{u}.
\end{aligned} \tag{102}$$
On the other hand,

$$\begin{aligned}
d\lambda &= (D_r f)(\mathbf{x})\,dx_r \wedge dx_1 \wedge \cdots \wedge dx_{r-1}
  \wedge dx_{r+1} \wedge \cdots \wedge dx_k \\
  &= (-1)^{r-1} (D_r f)(\mathbf{x})\,dx_1 \wedge \cdots \wedge dx_k
\end{aligned}$$

so that

$$\int_\sigma d\lambda = (-1)^{r-1} \int_{Q^k} (D_r f)(\mathbf{x})\,d\mathbf{x}. \tag{103}$$

We evaluate (103) by first integrating with respect to $x_r$, over the interval

$$[0,\ 1 - (x_1 + \cdots + x_{r-1} + x_{r+1} + \cdots + x_k)],$$

put $(x_1, \dots, x_{r-1}, x_{r+1}, \dots, x_k) = (u_1, \dots, u_{k-1})$, and see with
the aid of (98) that the integral over $Q^k$ in (103) is equal to the integral
over $Q^{k-1}$ in (102). Thus (94) holds, and the proof is complete.
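
As a sanity check on (94) in the case $k = 2$, one can compare both sides numerically on the standard simplex $\sigma = [0, \mathbf{e}_1, \mathbf{e}_2]$. The sketch below assumes a sample 1-form $\lambda = xy\,dx + x^2\,dy$ (an illustrative choice, not from the text), so that $d\lambda = x\,dx \wedge dy$, and uses the boundary $\partial\sigma = [\mathbf{e}_1, \mathbf{e}_2] - [0, \mathbf{e}_2] + [0, \mathbf{e}_1]$ given by (85). The helper names are hypothetical.

```python
def integrate_segment(P, Q, p0, p1, n=2000):
    """Integral of P dx + Q dy over the oriented affine 1-simplex [p0, p1]."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = p0[0] + t * dx, p0[1] + t * dy
        total += (P(x, y) * dx + Q(x, y) * dy) * h
    return total

def integrate_simplex(g, n=400):
    """Midpoint-rule integral of g over Q^2 = {x, y >= 0, x + y <= 1}."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n            # inner integration over [0, 1 - x]
        for j in range(n):
            y = (j + 0.5) * hy
            total += g(x, y) * hy * hx
    return total

# Hypothetical 1-form: lam = P dx + Q dy with P = x*y, Q = x^2.
P = lambda x, y: x * y
Q = lambda x, y: x * x
d_lam = lambda x, y: 2 * x - x        # D_1 Q - D_2 P, coefficient of dx ^ dy

o, e1, e2 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
# boundary of [0, e1, e2] is [e1, e2] - [0, e2] + [0, e1]
rhs = (integrate_segment(P, Q, e1, e2)
       - integrate_segment(P, Q, o, e2)
       + integrate_segment(P, Q, o, e1))
lhs = integrate_simplex(d_lam)
print(lhs, rhs)  # both should be close to 1/6
```

Only the edge $[\mathbf{e}_1, \mathbf{e}_2]$ contributes for this form, which mirrors the role of $\tau_0$ in the proof.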


CLOSED FORMS AND EXACT FORMS 

10.34 Definition Let $\omega$ be a k-form in an open set $E \subset R^n$. If there
is a $(k-1)$-form $\lambda$ in $E$ such that $\omega = d\lambda$, then $\omega$ is
said to be exact in $E$.

If $\omega$ is of class $\mathscr{C}'$ and $d\omega = 0$, then $\omega$ is said to be closed.

Theorem 10.20(b) shows that every exact form of class $\mathscr{C}'$ is closed.

In certain sets $E$, for example in convex ones, the converse is true; this
is the content of Theorem 10.39 (usually known as Poincaré's lemma) and
Theorem 10.40. However, Examples 10.36 and 10.37 will exhibit closed forms
that are not exact.

10.35 Remarks 

(a) Whether a given k-form $\omega$ is or is not closed can be verified by
simply differentiating the coefficients in the standard presentation of $\omega$.
For example, a 1-form

$$\omega = \sum_{i=1}^{n} f_i(\mathbf{x})\,dx_i, \tag{104}$$

with $f_i \in \mathscr{C}'(E)$ for some open set $E \subset R^n$, is closed if and
only if the equations

$$(D_j f_i)(\mathbf{x}) = (D_i f_j)(\mathbf{x}) \tag{105}$$

hold for all $i, j$ in $\{1, \dots, n\}$ and for all $\mathbf{x} \in E$.

Note that (105) is a "pointwise" condition; it does not involve any
global properties that depend on the shape of $E$.

On the other hand, to show that $\omega$ is exact in $E$, one has to prove
the existence of a form $\lambda$, defined in $E$, such that $d\lambda = \omega$. This
amounts to solving a system of partial differential equations, not just locally,
but in all of $E$. For example, to show that (104) is exact in a set $E$, one has
to find a function (or 0-form) $g \in \mathscr{C}'(E)$ such that

$$(D_i g)(\mathbf{x}) = f_i(\mathbf{x}) \qquad (\mathbf{x} \in E,\ 1 \le i \le n). \tag{106}$$

Of course, (105) is a necessary condition for the solvability of (106).

(b) Let $\omega$ be an exact k-form in $E$. Then there is a $(k-1)$-form $\lambda$
in $E$ with $d\lambda = \omega$, and Stokes' theorem asserts that

$$\int_\Psi \omega = \int_\Psi d\lambda = \int_{\partial\Psi} \lambda \tag{107}$$

for every k-chain $\Psi$ of class $\mathscr{C}''$ in $E$.

If $\Psi_1$ and $\Psi_2$ are such chains, and if they have the same boundaries,
it follows that

$$\int_{\Psi_1} \omega = \int_{\Psi_2} \omega.$$

In particular, the integral of an exact k-form in $E$ is 0 over every
k-chain in $E$ whose boundary is 0.

As an important special case of this, note that integrals of exact
1-forms in $E$ are 0 over closed (differentiable) curves in $E$.

(c) Let $\omega$ be a closed k-form in $E$. Then $d\omega = 0$, and Stokes' theorem
asserts that

$$\int_{\partial\Psi} \omega = \int_\Psi d\omega = 0 \tag{108}$$

for every $(k+1)$-chain $\Psi$ of class $\mathscr{C}''$ in $E$.

In other words, integrals of closed k-forms in $E$ are 0 over k-chains
that are boundaries of $(k+1)$-chains in $E$.

(d) Let $\Psi$ be a $(k+1)$-chain in $E$ and let $\lambda$ be a $(k-1)$-form in $E$,
both of class $\mathscr{C}''$. Since $d^2\lambda = 0$, two applications of Stokes'
theorem show that

$$\int_{\partial\partial\Psi} \lambda = \int_{\partial\Psi} d\lambda
  = \int_\Psi d^2\lambda = 0. \tag{109}$$

We conclude that $\partial^2\Psi = 0$. In other words, the boundary of a
boundary is 0.

See Exercise 16 for a more direct proof of this. 
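
The pointwise test (105) of Remark 10.35(a) lends itself to a quick numerical experiment. The sketch below approximates the partial derivatives by central differences (a numerical stand-in for differentiation, not a proof) and reports the largest value of $|D_j f_i - D_i f_j|$ at a few sample points. The function names, the sample points, and the contrasting form $y\,dx$ are illustrative assumptions; the first form is the one that appears in Example 10.36.

```python
def partial(f, x, j, h=1e-5):
    """Central-difference approximation of (D_j f)(x)."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return (f(xp) - f(xm)) / (2 * h)

def closedness_defect(coeffs, x):
    """Largest |D_j f_i - D_i f_j| at x for the 1-form sum f_i dx_i."""
    n = len(coeffs)
    return max(abs(partial(coeffs[i], x, j) - partial(coeffs[j], x, i))
               for i in range(n) for j in range(n))

# eta = (x dy - y dx) / (x^2 + y^2): coefficients (f_1, f_2)
eta = [lambda p: -p[1] / (p[0] ** 2 + p[1] ** 2),
       lambda p: p[0] / (p[0] ** 2 + p[1] ** 2)]
# A hypothetical non-closed form for contrast: y dx
not_closed = [lambda p: p[1], lambda p: 0.0]

points = [(1.0, 2.0), (-0.5, 0.3), (2.0, -1.5)]
eta_defects = [closedness_defect(eta, p) for p in points]
other_defects = [closedness_defect(not_closed, p) for p in points]
print(eta_defects, other_defects)
```

A small defect at finitely many points is of course only evidence of closedness; and, as the remark stresses, it says nothing about exactness, which depends on the shape of $E$.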
10.36 Example Let $E = R^2 - \{0\}$, the plane with the origin removed. The
1-form

$$\eta = \frac{x\,dy - y\,dx}{x^2 + y^2} \tag{110}$$

is closed in $R^2 - \{0\}$. This is easily verified by differentiation. Fix $r > 0$,
and define

$$\gamma(t) = (r \cos t, r \sin t) \qquad (0 \le t \le 2\pi). \tag{111}$$

Then $\gamma$ is a curve (an "oriented 1-simplex") in $R^2 - \{0\}$. Since
$\gamma(0) = \gamma(2\pi)$, we have

$$\partial\gamma = 0. \tag{112}$$

Direct computation shows that

$$\int_\gamma \eta = 2\pi \ne 0. \tag{113}$$

The discussion in Remarks 10.35(b) and (c) shows that we can draw two
conclusions from (113):

First, $\eta$ is not exact in $R^2 - \{0\}$, for otherwise (112) would force the
integral (113) to be 0.

Secondly, $\gamma$ is not the boundary of any 2-chain in $R^2 - \{0\}$ (of class
$\mathscr{C}''$), for otherwise the fact that $\eta$ is closed would force the
integral (113) to be 0.
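
The value in (113) is easy to confirm numerically. The following sketch integrates $\eta$ over circles by the midpoint rule; the helper `circle` and the particular radii and centers are illustrative assumptions. A circle about the origin should give $2\pi$ regardless of $r$, while a circle that does not surround the origin bounds a disc inside $R^2 - \{0\}$, so Remark 10.35(c) predicts the value 0.

```python
import math

def integrate_eta(curve, dcurve, n=2000):
    """Midpoint-rule integral of eta = (x dy - y dx)/(x^2 + y^2) over [0, 2*pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = curve(t)
        dx, dy = dcurve(t)
        total += (x * dy - y * dx) / (x * x + y * y) * h
    return total

def circle(cx, cy, r):
    """Circle of radius r about (cx, cy) and its derivative (illustrative helper)."""
    curve = lambda t: (cx + r * math.cos(t), cy + r * math.sin(t))
    dcurve = lambda t: (-r * math.sin(t), r * math.cos(t))
    return curve, dcurve

around_origin = [integrate_eta(*circle(0.0, 0.0, r)) for r in (0.5, 1.0, 7.0)]
away_from_origin = integrate_eta(*circle(3.0, 0.0, 1.0))
print(around_origin, away_from_origin)
```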


10.37 Example Let $E = R^3 - \{0\}$, 3-space with the origin removed. Define

$$\zeta = \frac{x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy}
  {(x^2 + y^2 + z^2)^{3/2}} \tag{114}$$

where we have written $(x, y, z)$ in place of $(x_1, x_2, x_3)$. Differentiation
shows that $d\zeta = 0$, so that $\zeta$ is a closed 2-form in $R^3 - \{0\}$.

Let $\Sigma$ be the 2-chain in $R^3 - \{0\}$ that was constructed in Example 10.32;
recall that $\Sigma$ is a parametrization of the unit sphere in $R^3$. Using the
rectangle $D$ of Example 10.32 as parameter domain, it is easy to compute that

$$\int_\Sigma \zeta = \int_D \sin u \,du\,dv = 4\pi \ne 0. \tag{115}$$

As in the preceding example, we can now conclude that $\zeta$ is not exact in
$R^3 - \{0\}$ (since $\partial\Sigma = 0$, as was shown in Example 10.32) and that the
sphere $\Sigma$ is not the boundary of any 3-chain in $R^3 - \{0\}$ (of class
$\mathscr{C}''$), although $\partial\Sigma = 0$.
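
The computation behind (115) can be reproduced numerically. In the sketch below the 2-form is pulled back to the rectangle $D$ by means of the three Jacobians of the sphere parametrization of Example 10.32 (entered by hand from that parametrization), and the result is integrated by the midpoint rule. The function names and grid sizes are illustrative assumptions.

```python
import math

def sigma(u, v):
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

def jacobians(u, v):
    """(d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) for sigma."""
    xu, xv = math.cos(u) * math.cos(v), -math.sin(u) * math.sin(v)
    yu, yv = math.cos(u) * math.sin(v), math.sin(u) * math.cos(v)
    zu, zv = -math.sin(u), 0.0
    return (yu * zv - yv * zu, zu * xv - zv * xu, xu * yv - xv * yu)

def pulled_back_zeta(u, v):
    """Coefficient of du ^ dv in the pullback of zeta under sigma."""
    x, y, z = sigma(u, v)
    jyz, jzx, jxy = jacobians(u, v)
    return (x * jyz + y * jzx + z * jxy) / (x * x + y * y + z * z) ** 1.5

def integrate_over_D(g, nu=400, nv=50):
    """Midpoint rule over D = [0, pi] x [0, 2*pi]."""
    hu, hv = math.pi / nu, 2 * math.pi / nv
    return sum(g((i + 0.5) * hu, (j + 0.5) * hv)
               for i in range(nu) for j in range(nv)) * hu * hv

total = integrate_over_D(pulled_back_zeta)
print(total, 4 * math.pi)
```

The pulled-back coefficient simplifies to $\sin u$, which is the integrand shown in (115).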
The following result will be used in the proof of Theorem 10.39. 
10.38 Theorem Suppose $E$ is a convex open set in $R^n$, $f \in \mathscr{C}'(E)$,
$p$ is an integer, $1 \le p \le n$, and

$$(D_j f)(\mathbf{x}) = 0 \qquad (p < j \le n,\ \mathbf{x} \in E). \tag{116}$$

Then there exists an $F \in \mathscr{C}'(E)$ such that

$$(D_p F)(\mathbf{x}) = f(\mathbf{x}), \quad (D_j F)(\mathbf{x}) = 0
  \qquad (p < j \le n,\ \mathbf{x} \in E). \tag{117}$$

Proof Write $\mathbf{x} = (\mathbf{x}', x_p, \mathbf{x}'')$, where

$$\mathbf{x}' = (x_1, \dots, x_{p-1}), \qquad \mathbf{x}'' = (x_{p+1}, \dots, x_n).$$

(When $p = 1$, $\mathbf{x}'$ is absent; when $p = n$, $\mathbf{x}''$ is absent.) Let $V$
be the set of all $(\mathbf{x}', x_p) \in R^p$ such that
$(\mathbf{x}', x_p, \mathbf{x}'') \in E$ for some $\mathbf{x}''$. Being a projection
of $E$, $V$ is a convex open set in $R^p$. Since $E$ is convex and (116)
holds, $f(\mathbf{x})$ does not depend on $\mathbf{x}''$. Hence there is a function
$\varphi$, with domain $V$, such that

$$f(\mathbf{x}) = \varphi(\mathbf{x}', x_p)$$

for all $\mathbf{x} \in E$.

If $p = 1$, $V$ is a segment in $R^1$ (possibly unbounded). Pick $c \in V$
and define

$$F(\mathbf{x}) = \int_c^{x_1} \varphi(t)\,dt \qquad (\mathbf{x} \in E).$$

If $p > 1$, let $U$ be the set of all $\mathbf{x}' \in R^{p-1}$ such that
$(\mathbf{x}', x_p) \in V$ for some $x_p$. Then $U$ is a convex open set in $R^{p-1}$,
and there is a function $\alpha \in \mathscr{C}'(U)$ such that
$(\mathbf{x}', \alpha(\mathbf{x}')) \in V$ for every $\mathbf{x}' \in U$; in other words,
the graph of $\alpha$ lies in $V$ (Exercise 29). Define

$$F(\mathbf{x}) = \int_{\alpha(\mathbf{x}')}^{x_p} \varphi(\mathbf{x}', t)\,dt
  \qquad (\mathbf{x} \in E).$$

In either case, $F$ satisfies (117).

(Note: Recall the usual convention that $\int_a^b$ means $-\int_b^a$ if $b < a$.)

10.39 Theorem If $E \subset R^n$ is convex and open, if $k \ge 1$, if $\omega$ is a
k-form of class $\mathscr{C}'$ in $E$, and if $d\omega = 0$, then there is a
$(k-1)$-form $\lambda$ in $E$ such that $\omega = d\lambda$.

Briefly, closed forms are exact in convex sets.

Proof For $p = 1, \dots, n$, let $Y_p$ denote the set of all k-forms $\omega$, of
class $\mathscr{C}'$ in $E$, whose standard presentation

$$\omega = \sum_I f_I(\mathbf{x})\,dx_I \tag{118}$$

does not involve $dx_{p+1}, \dots, dx_n$. In other words, $I \subset \{1, \dots, p\}$
if $f_I(\mathbf{x}) \ne 0$ for some $\mathbf{x} \in E$.

We shall proceed by induction on $p$.

Assume first that $\omega \in Y_1$. Then $\omega = f(\mathbf{x})\,dx_1$. Since
$d\omega = 0$, $(D_j f)(\mathbf{x}) = 0$ for $1 < j \le n$, $\mathbf{x} \in E$. By
Theorem 10.38 there is an $F \in \mathscr{C}'(E)$ such that $D_1 F = f$ and
$D_j F = 0$ for $1 < j \le n$. Thus

$$dF = (D_1 F)(\mathbf{x})\,dx_1 = f(\mathbf{x})\,dx_1 = \omega.$$

Now we take $p > 1$ and make the following induction hypothesis:
Every closed k-form that belongs to $Y_{p-1}$ is exact in $E$.

Choose $\omega \in Y_p$ so that $d\omega = 0$. By (118),

$$\sum_I \sum_{j=1}^{n} (D_j f_I)(\mathbf{x})\,dx_j \wedge dx_I = d\omega = 0. \tag{119}$$

Consider a fixed $j$, with $p < j \le n$. Each $I$ that occurs in (118) lies in
$\{1, \dots, p\}$. If $I_1$, $I_2$ are two of these k-indices, and if $I_1 \ne I_2$,
then the $(k+1)$-indices $(I_1, j)$, $(I_2, j)$ are distinct. Thus there is no
cancellation, and we conclude from (119) that every coefficient in (118) satisfies

$$(D_j f_I)(\mathbf{x}) = 0 \qquad (\mathbf{x} \in E,\ p < j \le n). \tag{120}$$

We now gather those terms in (118) that contain $dx_p$ and rewrite $\omega$
in the form

$$\omega = \alpha + \sum_{I_0} f_I(\mathbf{x})\,dx_{I_0} \wedge dx_p, \tag{121}$$

where $\alpha \in Y_{p-1}$, each $I_0$ is an increasing $(k-1)$-index in
$\{1, \dots, p-1\}$, and $I = (I_0, p)$. By (120), Theorem 10.38 furnishes functions
$F_I \in \mathscr{C}'(E)$ such that

$$D_p F_I = f_I, \quad D_j F_I = 0 \qquad (p < j \le n). \tag{122}$$

Put

$$\beta = \sum_{I_0} F_I(\mathbf{x})\,dx_{I_0} \tag{123}$$

and define $\gamma = \omega - (-1)^{k-1}\,d\beta$. Since $\beta$ is a $(k-1)$-form, it
follows that

$$\begin{aligned}
\gamma &= \omega - \sum_{I_0} \sum_{j=1}^{p} (D_j F_I)(\mathbf{x})\,dx_{I_0} \wedge dx_j \\
  &= \alpha - \sum_{I_0} \sum_{j=1}^{p-1} (D_j F_I)(\mathbf{x})\,dx_{I_0} \wedge dx_j,
\end{aligned}$$

which is clearly in $Y_{p-1}$. Since $d\omega = 0$ and $d^2\beta = 0$, we have
$d\gamma = 0$.

Our induction hypothesis shows therefore that $\gamma = d\mu$ for some
$(k-1)$-form $\mu$ in $E$. If $\lambda = \mu + (-1)^{k-1}\beta$, we conclude that
$\omega = d\lambda$.

By induction, this completes the proof.
10.40 Theorem Fix $k$, $1 \le k \le n$. Let $E \subset R^n$ be an open set in which
every closed k-form is exact. Let $T$ be a 1-1 $\mathscr{C}''$-mapping of $E$ onto an
open set $U \subset R^n$ whose inverse $S$ is also of class $\mathscr{C}''$.

Then every closed k-form in $U$ is exact in $U$.

Note that every convex open set $E$ satisfies the present hypothesis, by
Theorem 10.39. The relation between $E$ and $U$ may be expressed by saying
that they are $\mathscr{C}''$-equivalent.

Thus every closed form is exact in any set which is $\mathscr{C}''$-equivalent to a
convex open set.

Proof Let $\omega$ be a k-form in $U$, with $d\omega = 0$. By Theorem 10.22(c),
$\omega_T$ is a k-form in $E$ for which $d(\omega_T) = 0$. Hence $\omega_T = d\lambda$
for some $(k-1)$-form $\lambda$ in $E$. By Theorem 10.23, and another application of
Theorem 10.22(c),

$$\omega = (\omega_T)_S = (d\lambda)_S = d(\lambda_S).$$

Since $\lambda_S$ is a $(k-1)$-form in $U$, $\omega$ is exact in $U$.

10.41 Remark In applications, cells (see Definition 2.17) are often more con- 
venient parameter domains than simplexes. If our whole development had 
been based on cells rather than simplexes, the computation that occurs in the 
proof of Stokes’ theorem would be even simpler. (It is done that way in Spivak’s 
book.) The reason for preferring simplexes is that the definition of the boundary 
of an oriented simplex seems easier and more natural than is the case for a cell. 
(See Exercise 19.) Also, the partitioning of sets into simplexes (called “triangu- 
lation”) plays an important role in topology, and there are strong connections 
between certain aspects of topology, on the one hand, and differential forms, 
on the other. These are hinted at in Sec. 10.35. The book by Singer and Thorpe 
contains a good introduction to this topic. 

Since every cell can be triangulated, we may regard it as a chain. For 
dimension 2, this was done in Example 10.32; for dimension 3, see Exercise 18. 

Poincaré’s lemma (Theorem 10.39) can be proved in several ways. See,
for example, page 94 in Spivak’s book, or page 280 in Fleming’s. Two simple 
proofs for certain special cases are indicated in Exercises 24 and 27. 


VECTOR ANALYSIS 

We conclude this chapter with a few applications of the preceding material to 
theorems concerning vector analysis in R 3 . These are special cases of theorems 
about differential forms, but are usually stated in different terminology. We 
are thus faced with the job of translating from one language to another. 
10.42 Vector fields Let $\mathbf{F} = F_1\mathbf{e}_1 + F_2\mathbf{e}_2 + F_3\mathbf{e}_3$
be a continuous mapping of an open set $E \subset R^3$ into $R^3$. Since $\mathbf{F}$
associates a vector to each point of $E$, $\mathbf{F}$ is sometimes called a vector
field, especially in physics. With every such $\mathbf{F}$ is associated a 1-form

$$\lambda_{\mathbf{F}} = F_1\,dx + F_2\,dy + F_3\,dz \tag{124}$$

and a 2-form

$$\omega_{\mathbf{F}} = F_1\,dy \wedge dz + F_2\,dz \wedge dx + F_3\,dx \wedge dy. \tag{125}$$

Here, and in the rest of this chapter, we use the customary notation $(x, y, z)$
in place of $(x_1, x_2, x_3)$.

It is clear, conversely, that every 1-form $\lambda$ in $E$ is $\lambda_{\mathbf{F}}$ for
some vector field $\mathbf{F}$ in $E$, and that every 2-form $\omega$ is
$\omega_{\mathbf{F}}$ for some $\mathbf{F}$. In $R^3$, the study of 1-forms
and 2-forms is thus coextensive with the study of vector fields.

If $u \in \mathscr{C}'(E)$ is a real function, then its gradient

$$\nabla u = (D_1 u)\mathbf{e}_1 + (D_2 u)\mathbf{e}_2 + (D_3 u)\mathbf{e}_3$$

is an example of a vector field in $E$.

Suppose now that $\mathbf{F}$ is a vector field in $E$, of class $\mathscr{C}'$. Its
curl $\nabla \times \mathbf{F}$ is the vector field defined in $E$ by

$$\nabla \times \mathbf{F} = (D_2 F_3 - D_3 F_2)\mathbf{e}_1
  + (D_3 F_1 - D_1 F_3)\mathbf{e}_2 + (D_1 F_2 - D_2 F_1)\mathbf{e}_3$$

and its divergence is the real function $\nabla \cdot \mathbf{F}$ defined in $E$ by

$$\nabla \cdot \mathbf{F} = D_1 F_1 + D_2 F_2 + D_3 F_3.$$

These quantities have various physical interpretations. We refer to the 
book by O. D. Kellogg for more details. 

Here are some relations between gradients, curls, and divergences. 

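10.43 Theorem Suppose $E$ is an open set in $R^3$, $u \in \mathscr{C}''(E)$, and
$\mathbf{G}$ is a vector field in $E$, of class $\mathscr{C}''$.

(a) If $\mathbf{F} = \nabla u$, then $\nabla \times \mathbf{F} = 0$.

(b) If $\mathbf{F} = \nabla \times \mathbf{G}$, then $\nabla \cdot \mathbf{F} = 0$.

Furthermore, if $E$ is $\mathscr{C}''$-equivalent to a convex set, then (a) and (b)
have converses, in which we assume that $\mathbf{F}$ is a vector field in $E$, of
class $\mathscr{C}'$:

(a') If $\nabla \times \mathbf{F} = 0$, then $\mathbf{F} = \nabla u$ for some
$u \in \mathscr{C}''(E)$.

(b') If $\nabla \cdot \mathbf{F} = 0$, then $\mathbf{F} = \nabla \times \mathbf{G}$ for
some vector field $\mathbf{G}$ in $E$, of class $\mathscr{C}''$.

Proof If we compare the definitions of $\nabla u$, $\nabla \times \mathbf{F}$, and
$\nabla \cdot \mathbf{F}$ with the differential forms $\lambda_{\mathbf{F}}$ and
$\omega_{\mathbf{F}}$ given by (124) and (125), we obtain the
following four statements:

$\mathbf{F} = \nabla u$ if and only if $\lambda_{\mathbf{F}} = du$.

$\nabla \times \mathbf{F} = 0$ if and only if $d\lambda_{\mathbf{F}} = 0$.

$\mathbf{F} = \nabla \times \mathbf{G}$ if and only if
$\omega_{\mathbf{F}} = d\lambda_{\mathbf{G}}$.

$\nabla \cdot \mathbf{F} = 0$ if and only if $d\omega_{\mathbf{F}} = 0$.

Now if $\mathbf{F} = \nabla u$, then $\lambda_{\mathbf{F}} = du$, hence
$d\lambda_{\mathbf{F}} = d^2 u = 0$ (Theorem 10.20),
which means that $\nabla \times \mathbf{F} = 0$. Thus (a) is proved.

As regards (a'), the hypothesis amounts to saying that $d\lambda_{\mathbf{F}} = 0$
in $E$. By Theorem 10.40, $\lambda_{\mathbf{F}} = du$ for some 0-form $u$. Hence
$\mathbf{F} = \nabla u$.

The proofs of (b) and (b') follow exactly the same pattern.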
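
Parts (a) and (b) of Theorem 10.43 can be illustrated numerically. The sketch below builds gradient, curl, and divergence from central differences (a numerical approximation, with an arbitrarily chosen step `H`) and evaluates $\nabla \times (\nabla u)$ and $\nabla \cdot (\nabla \times \mathbf{G})$ at one point. The sample function `u`, the field `G`, and the evaluation point are hypothetical choices, not taken from the text.

```python
import math

H = 1e-3  # step for central differences (an arbitrary illustrative choice)

def partial(f, p, j, h=H):
    """Central-difference approximation of (D_j f)(p) for p in R^3."""
    q_plus, q_minus = list(p), list(p)
    q_plus[j] += h
    q_minus[j] -= h
    return (f(q_plus) - f(q_minus)) / (2 * h)

def grad(u):
    return lambda p: tuple(partial(u, p, j) for j in range(3))

def curl(F):
    comp = lambda i: (lambda q: F(q)[i])   # i-th component function of F
    return lambda p: (
        partial(comp(2), p, 1) - partial(comp(1), p, 2),
        partial(comp(0), p, 2) - partial(comp(2), p, 0),
        partial(comp(1), p, 0) - partial(comp(0), p, 1),
    )

def div(F):
    comp = lambda i: (lambda q: F(q)[i])
    return lambda p: sum(partial(comp(i), p, i) for i in range(3))

# Hypothetical sample data, smooth on all of R^3.
u = lambda p: p[0] ** 2 * p[1] + p[0] * math.sin(p[2])
G = lambda p: (p[1] * p[2], p[0] ** 2, p[0] * p[1] * p[2])

point = (0.3, -1.2, 0.7)
curl_of_grad = curl(grad(u))(point)
div_of_curl = div(curl(G))(point)
print(curl_of_grad, div_of_curl)
```

Both outputs should vanish up to rounding error, reflecting $d^2 = 0$ in the language of forms.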


10.44 Volume elements The k-form

$$dx_1 \wedge \cdots \wedge dx_k$$

is called the volume element in $R^k$. It is often denoted by $dV$ (or by $dV_k$ if it
seems desirable to indicate the dimension explicitly), and the notation

$$\int_\Phi f(\mathbf{x})\,dx_1 \wedge \cdots \wedge dx_k = \int_\Phi f\,dV \tag{126}$$

is used when $\Phi$ is a positively oriented k-surface in $R^k$ and $f$ is a continuous
function on the range of $\Phi$.

The reason for using this terminology is very simple: If $D$ is a parameter
domain in $R^k$, and if $\Phi$ is a 1-1 $\mathscr{C}'$-mapping of $D$ into $R^k$, with
positive Jacobian $J_\Phi$, then the left side of (126) is

$$\int_D f(\Phi(\mathbf{u})) J_\Phi(\mathbf{u})\,d\mathbf{u}
  = \int_{\Phi(D)} f(\mathbf{x})\,d\mathbf{x},$$

by (35) and Theorem 10.9.

In particular, when $f = 1$, (126) defines the volume of $\Phi$. We already saw
a special case of this in (36).

The usual notation for $dV_2$ is $dA$.


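10.45 Green’s theorem Suppose $E$ is an open set in $R^2$, $\alpha \in \mathscr{C}'(E)$,
$\beta \in \mathscr{C}'(E)$, and $\Omega$ is a closed subset of $E$, with positively
oriented boundary $\partial\Omega$, as described in Sec. 10.31. Then

$$\int_{\partial\Omega} (\alpha\,dx + \beta\,dy)
  = \int_\Omega \left( \frac{\partial\beta}{\partial x}
  - \frac{\partial\alpha}{\partial y} \right) dA. \tag{127}$$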





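Proof Put $\lambda = \alpha\,dx + \beta\,dy$. Then

$$\begin{aligned}
d\lambda &= (D_2\alpha)\,dy \wedge dx + (D_1\beta)\,dx \wedge dy \\
  &= (D_1\beta - D_2\alpha)\,dA,
\end{aligned}$$

and (127) is the same as

$$\int_{\partial\Omega} \lambda = \int_\Omega d\lambda,$$

which is true by Theorem 10.33.

With $\alpha(x, y) = -y$ and $\beta(x, y) = x$, (127) becomes

$$\frac{1}{2} \int_{\partial\Omega} (x\,dy - y\,dx) = A(\Omega), \tag{128}$$

the area of $\Omega$.

With $\alpha = 0$, $\beta = x$, a similar formula is obtained. Example 10.12(b)
contains a special case of this.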
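
Formula (128) has a familiar computational form when the boundary is a polygon traversed counterclockwise: on each edge the integral of $x\,dy - y\,dx$ can be evaluated in closed form, which yields the so-called shoelace formula. The sketch below assumes that a polygon may be treated as an admissible region, which is an informal extension of the hypotheses of Sec. 10.31; the function name and the sample polygons are illustrative.

```python
def polygon_area(vertices):
    """Area of a polygon from (128): half the integral of x dy - y dx over
    its boundary, traversed counterclockwise."""
    total = 0.0
    m = len(vertices)
    for i in range(m):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % m]
        # On the edge (x0, y0) + t (x1 - x0, y1 - y0), 0 <= t <= 1, the
        # integral of x dy - y dx equals x0*y1 - x1*y0 exactly.
        total += x0 * y1 - x1 * y0
    return 0.5 * total

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (0, 3)]
print(polygon_area(unit_square), polygon_area(triangle))  # 1.0 6.0
```

Listing the vertices clockwise reverses the orientation of the boundary and changes the sign of the result.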


10.46 Area elements in $R^3$ Let $\Phi$ be a 2-surface in $R^3$, of class
$\mathscr{C}'$, with parameter domain $D \subset R^2$. Associate with each point
$(u, v) \in D$ the vector

$$\mathbf{N}(u, v) = \frac{\partial(y, z)}{\partial(u, v)}\,\mathbf{e}_1
  + \frac{\partial(z, x)}{\partial(u, v)}\,\mathbf{e}_2
  + \frac{\partial(x, y)}{\partial(u, v)}\,\mathbf{e}_3. \tag{129}$$

The Jacobians in (129) correspond to the equation

$$(x, y, z) = \Phi(u, v). \tag{130}$$

If $f$ is a continuous function on $\Phi(D)$, the area integral of $f$ over $\Phi$ is
defined to be

$$\int_\Phi f\,dA = \int_D f(\Phi(u, v))\,|\mathbf{N}(u, v)|\,du\,dv. \tag{131}$$

In particular, when $f = 1$ we obtain the area of $\Phi$, namely,

$$A(\Phi) = \int_D |\mathbf{N}(u, v)|\,du\,dv. \tag{132}$$

The following discussion will show that (131) and its special case (132)
are reasonable definitions. It will also describe the geometric features of the
vector $\mathbf{N}$.

Write $\Phi = \varphi_1\mathbf{e}_1 + \varphi_2\mathbf{e}_2 + \varphi_3\mathbf{e}_3$, fix
a point $\mathbf{p}_0 = (u_0, v_0) \in D$, put $\mathbf{N} = \mathbf{N}(\mathbf{p}_0)$, put

$$\alpha_i = (D_1\varphi_i)(\mathbf{p}_0), \quad \beta_i = (D_2\varphi_i)(\mathbf{p}_0)
  \qquad (i = 1, 2, 3) \tag{133}$$

and let $T \in L(R^2, R^3)$ be the linear transformation given by

$$T(u, v) = \sum_{i=1}^{3} (\alpha_i u + \beta_i v)\,\mathbf{e}_i. \tag{134}$$

Note that $T = \Phi'(\mathbf{p}_0)$, in accordance with Definition 9.11.

Let us now assume that the rank of $T$ is 2. (If it is 1 or 0, then
$\mathbf{N} = 0$, and the tangent plane mentioned below degenerates to a line or to a
point.) The range of the affine mapping

$$(u, v) \to \Phi(\mathbf{p}_0) + T(u, v)$$

is then a plane $\Pi$, called the tangent plane to $\Phi$ at $\mathbf{p}_0$. [One would
like to call $\Pi$ the tangent plane at $\Phi(\mathbf{p}_0)$, rather than at
$\mathbf{p}_0$; if $\Phi$ is not one-to-one, this runs into difficulties.]

If we use (133) in (129), we obtain

$$\mathbf{N} = (\alpha_2\beta_3 - \alpha_3\beta_2)\mathbf{e}_1
  + (\alpha_3\beta_1 - \alpha_1\beta_3)\mathbf{e}_2
  + (\alpha_1\beta_2 - \alpha_2\beta_1)\mathbf{e}_3, \tag{135}$$

and (134) shows that

$$T\mathbf{e}_1 = \sum_{i=1}^{3} \alpha_i\mathbf{e}_i, \qquad
  T\mathbf{e}_2 = \sum_{i=1}^{3} \beta_i\mathbf{e}_i. \tag{136}$$

A straightforward computation now leads to

$$\mathbf{N} \cdot (T\mathbf{e}_1) = 0 = \mathbf{N} \cdot (T\mathbf{e}_2). \tag{137}$$

Hence $\mathbf{N}$ is perpendicular to $\Pi$. It is therefore called the normal to
$\Phi$ at $\mathbf{p}_0$.

A second property of $\mathbf{N}$, also verified by a direct computation based on
(135) and (136), is that the determinant of the linear transformation of $R^3$ that
takes $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ to
$\{T\mathbf{e}_1, T\mathbf{e}_2, \mathbf{N}\}$ is $|\mathbf{N}|^2 > 0$ (Exercise 30).
The 3-simplex

$$[0, T\mathbf{e}_1, T\mathbf{e}_2, \mathbf{N}] \tag{138}$$

is thus positively oriented.

The third property of $\mathbf{N}$ that we shall use is a consequence of the first two:
The above-mentioned determinant, whose value is $|\mathbf{N}|^2$, is the volume of the
parallelepiped with edges $[0, T\mathbf{e}_1]$, $[0, T\mathbf{e}_2]$, $[0, \mathbf{N}]$.
By (137), $[0, \mathbf{N}]$ is perpendicular to the other two edges. The area of the
parallelogram with vertices

$$0,\ T\mathbf{e}_1,\ T\mathbf{e}_2,\ T(\mathbf{e}_1 + \mathbf{e}_2) \tag{139}$$

is therefore $|\mathbf{N}|$.

This parallelogram is the image under $T$ of the unit square in $R^2$. If $E$
is any rectangle in $R^2$, it follows (by the linearity of $T$) that the area of the
parallelogram $T(E)$ is

$$A(T(E)) = |\mathbf{N}|\,A(E) = \int_E |\mathbf{N}(u_0, v_0)|\,du\,dv. \tag{140}$$

We conclude that (132) is correct when $\Phi$ is affine. To justify the definition
(132) in the general case, divide $D$ into small rectangles, pick a point $(u_0, v_0)$
in each, and replace $\Phi$ in each rectangle by the corresponding tangent plane.
The sum of the areas of the resulting parallelograms, obtained via (140), is then
an approximation to $A(\Phi)$. Finally, one can justify (131) from (132) by
approximating $f$ by step functions.
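
The first two properties of $\mathbf{N}$ can be checked on a concrete instance. In the sketch below the vectors `alpha` and `beta` stand for $T\mathbf{e}_1$ and $T\mathbf{e}_2$ and are hypothetical sample values; `cross` implements the right side of (135), and `det_columns` computes the determinant of the matrix with the given columns as a scalar triple product.

```python
def cross(a, b):
    """The vector N of (135) built from alpha = a and beta = b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det_columns(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return dot(a, cross(b, c))

alpha = (1.0, 2.0, 3.0)    # hypothetical T e_1
beta = (-1.0, 0.0, 2.0)    # hypothetical T e_2
N = cross(alpha, beta)
print(N, dot(N, alpha), dot(N, beta), det_columns(alpha, beta, N), dot(N, N))
```

Here $\mathbf{N} = (4, -5, 2)$, both dot products in (137) vanish, and the determinant equals $|\mathbf{N}|^2 = 45$.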


10.47 Example Let $0 < a < b$ be fixed. Let $K$ be the 3-cell determined by

$$0 \le t \le a, \qquad 0 \le u \le 2\pi, \qquad 0 \le v \le 2\pi.$$

The equations

$$\begin{aligned}
x &= t \cos u \\
y &= (b + t \sin u) \cos v \\
z &= (b + t \sin u) \sin v
\end{aligned} \tag{141}$$

describe a mapping $\Psi$ of $R^3$ into $R^3$ which is 1-1 in the interior of $K$, such
that $\Psi(K)$ is a solid torus. Its Jacobian is

$$J_\Psi = \frac{\partial(x, y, z)}{\partial(t, u, v)} = t(b + t \sin u)$$

which is positive on $K$, except on the face $t = 0$. If we integrate $J_\Psi$ over $K$,
we obtain

$$\operatorname{vol}(\Psi(K)) = 2\pi^2 a^2 b$$

as the volume of our solid torus.

Now consider the 2-chain $\Phi = \partial\Psi$. (See Exercise 19.) $\Psi$ maps the faces
$u = 0$ and $u = 2\pi$ of $K$ onto the same cylindrical strip, but with opposite
orientations. $\Psi$ maps the faces $v = 0$ and $v = 2\pi$ onto the same circular disc,
but with opposite orientations. $\Psi$ maps the face $t = 0$ onto a circle, which
contributes 0 to the 2-chain $\partial\Psi$. (The relevant Jacobians are 0.) Thus $\Phi$
is simply the 2-surface obtained by setting $t = a$ in (141), with parameter domain $D$
the square defined by $0 \le u \le 2\pi$, $0 \le v \le 2\pi$.

According to (129) and (141), the normal to $\Phi$ at $(u, v) \in D$ is thus the
vector

$$\mathbf{N}(u, v) = a(b + a \sin u)\,\mathbf{n}(u, v)$$

where

$$\mathbf{n}(u, v) = (\cos u)\mathbf{e}_1 + (\sin u \cos v)\mathbf{e}_2
  + (\sin u \sin v)\mathbf{e}_3.$$

Since $|\mathbf{n}(u, v)| = 1$, we have $|\mathbf{N}(u, v)| = a(b + a \sin u)$, and if we
integrate this over $D$, (131) gives

$$A(\Phi) = 4\pi^2 ab$$

as the surface area of our torus.

If we think of $\mathbf{N} = \mathbf{N}(u, v)$ as a directed line segment, pointing from
$\Phi(u, v)$ to $\Phi(u, v) + \mathbf{N}(u, v)$, then $\mathbf{N}$ points outward, that is
to say, away from $\Psi(K)$. This is so because $J_\Psi > 0$ when $t = a$.

For example, take $u = v = \pi/2$, $t = a$. This gives the largest value of $z$ on
$\Psi(K)$, and $\mathbf{N} = a(b + a)\mathbf{e}_3$ points "upward" for this choice of
$(u, v)$.
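
Both torus formulas can be confirmed by direct numerical integration. The sketch below fixes hypothetical radii $a = 1$, $b = 3$, evaluates $\mathbf{N}(u, v)$ from (129) using the partial derivatives of (141) at $t = a$ (entered by hand), integrates $|\mathbf{N}|$ over $D$, and integrates the Jacobian $t(b + t \sin u)$ over $K$. The midpoint rule and the grid sizes are illustrative choices.

```python
import math

a, b = 1.0, 3.0  # hypothetical radii with 0 < a < b

def normal(u, v):
    """N(u, v) from (129) for the surface t = a of (141)."""
    R = b + a * math.sin(u)
    xu, xv = -a * math.sin(u), 0.0
    yu, yv = a * math.cos(u) * math.cos(v), -R * math.sin(v)
    zu, zv = a * math.cos(u) * math.sin(v), R * math.cos(v)
    return (yu * zv - yv * zu, zu * xv - zv * xu, xu * yv - xv * yu)

def torus_area(n=200):
    """Midpoint-rule integral of |N(u, v)| over D."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            N = normal((i + 0.5) * h, (j + 0.5) * h)
            total += math.sqrt(sum(c * c for c in N)) * h * h
    return total

def torus_volume(n=200):
    """Integral of the Jacobian t (b + t sin u) over K; the v-integration
    contributes a factor 2*pi because the Jacobian does not depend on v."""
    ht, hu = a / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * ht
        for j in range(n):
            u = (j + 0.5) * hu
            total += t * (b + t * math.sin(u)) * ht * hu
    return 2 * math.pi * total

area, volume = torus_area(), torus_volume()
print(area, 4 * math.pi ** 2 * a * b)
print(volume, 2 * math.pi ** 2 * a ** 2 * b)
```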

10.48 Integrals of 1-forms in $R^3$ Let $\gamma$ be a $\mathscr{C}'$-curve in an open
set $E \subset R^3$, with parameter interval $[0, 1]$, let $\mathbf{F}$ be a vector field
in $E$, as in Sec. 10.42, and define $\lambda_{\mathbf{F}}$ by (124). The integral of
$\lambda_{\mathbf{F}}$ over $\gamma$ can be rewritten in a certain way
which we now describe.

For any $u \in [0, 1]$,

$$\gamma'(u) = \gamma_1'(u)\mathbf{e}_1 + \gamma_2'(u)\mathbf{e}_2 + \gamma_3'(u)\mathbf{e}_3$$

is called the tangent vector to $\gamma$ at $u$. We define $\mathbf{t} = \mathbf{t}(u)$ to
be the unit vector in the direction of $\gamma'(u)$. Thus

$$\gamma'(u) = |\gamma'(u)|\,\mathbf{t}(u).$$

[If $\gamma'(u) = 0$ for some $u$, put $\mathbf{t}(u) = \mathbf{e}_1$; any other choice
would do just as well.] By (35),

$$\begin{aligned}
\int_\gamma \lambda_{\mathbf{F}}
  &= \sum_{i=1}^{3} \int_0^1 F_i(\gamma(u))\,\gamma_i'(u)\,du \\
  &= \int_0^1 \mathbf{F}(\gamma(u)) \cdot \gamma'(u)\,du \\
  &= \int_0^1 \mathbf{F}(\gamma(u)) \cdot \mathbf{t}(u)\,|\gamma'(u)|\,du.
\end{aligned} \tag{142}$$

Theorem 6.27 makes it reasonable to call $|\gamma'(u)|\,du$ the element of arc
length along $\gamma$. A customary notation for it is $ds$, and (142) is rewritten in
the form

$$\int_\gamma \lambda_{\mathbf{F}} = \int_\gamma (\mathbf{F} \cdot \mathbf{t})\,ds. \tag{143}$$

Since $\mathbf{t}$ is a unit tangent vector to $\gamma$, $\mathbf{F} \cdot \mathbf{t}$ is
called the tangential component of $\mathbf{F}$ along $\gamma$.

The right side of (143) should be regarded as just an abbreviation for the
last integral in (142). The point is that $\mathbf{F}$ is defined on the range of
$\gamma$, but $\mathbf{t}$ is defined on $[0, 1]$; thus $\mathbf{F} \cdot \mathbf{t}$ has to
be properly interpreted. Of course, when $\gamma$ is one-to-one, then $\mathbf{t}(u)$ can
be replaced by $\mathbf{t}(\gamma(u))$, and this difficulty disappears.


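10.49 Integrals of 2-forms in $R^3$ Let $\Phi$ be a 2-surface in an open set
$E \subset R^3$, of class $\mathscr{C}'$, with parameter domain $D \subset R^2$. Let
$\mathbf{F}$ be a vector field in $E$, and define $\omega_{\mathbf{F}}$ by (125). As in
the preceding section, we shall obtain a different
representation of the integral of $\omega_{\mathbf{F}}$ over $\Phi$.

By (35) and (129),

$$\begin{aligned}
\int_\Phi \omega_{\mathbf{F}}
  &= \int_\Phi (F_1\,dy \wedge dz + F_2\,dz \wedge dx + F_3\,dx \wedge dy) \\
  &= \int_D \left\{ (F_1 \circ \Phi) \frac{\partial(y, z)}{\partial(u, v)}
     + (F_2 \circ \Phi) \frac{\partial(z, x)}{\partial(u, v)}
     + (F_3 \circ \Phi) \frac{\partial(x, y)}{\partial(u, v)} \right\} du\,dv \\
  &= \int_D \mathbf{F}(\Phi(u, v)) \cdot \mathbf{N}(u, v)\,du\,dv.
\end{aligned}$$

Now let $\mathbf{n} = \mathbf{n}(u, v)$ be the unit vector in the direction of
$\mathbf{N}(u, v)$. [If $\mathbf{N}(u, v) = 0$ for some $(u, v) \in D$, take
$\mathbf{n}(u, v) = \mathbf{e}_1$.] Then $\mathbf{N} = |\mathbf{N}|\,\mathbf{n}$, and
therefore the last integral becomes

$$\int_D \mathbf{F}(\Phi(u, v)) \cdot \mathbf{n}(u, v)\,|\mathbf{N}(u, v)|\,du\,dv.$$

By (131), we can finally write this in the form

$$\int_\Phi \omega_{\mathbf{F}} = \int_\Phi (\mathbf{F} \cdot \mathbf{n})\,dA. \tag{144}$$

With regard to the meaning of $\mathbf{F} \cdot \mathbf{n}$, the remark made at the end
of Sec. 10.48 applies here as well.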

We can now state the original form of Stokes’ theorem. 


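10.50 Stokes’ formula If $\mathbf{F}$ is a vector field of class $\mathscr{C}'$ in an
open set $E \subset R^3$, and if $\Phi$ is a 2-surface of class $\mathscr{C}''$ in $E$, then

$$\int_\Phi (\nabla \times \mathbf{F}) \cdot \mathbf{n}\,dA
  = \int_{\partial\Phi} (\mathbf{F} \cdot \mathbf{t})\,ds. \tag{145}$$

Proof Put $\mathbf{H} = \nabla \times \mathbf{F}$. Then, as in the proof of
Theorem 10.43, we have

$$\omega_{\mathbf{H}} = d\lambda_{\mathbf{F}}. \tag{146}$$

Hence

$$\begin{aligned}
\int_\Phi (\nabla \times \mathbf{F}) \cdot \mathbf{n}\,dA
  &= \int_\Phi (\mathbf{H} \cdot \mathbf{n})\,dA = \int_\Phi \omega_{\mathbf{H}} \\
  &= \int_\Phi d\lambda_{\mathbf{F}} = \int_{\partial\Phi} \lambda_{\mathbf{F}}
   = \int_{\partial\Phi} (\mathbf{F} \cdot \mathbf{t})\,ds.
\end{aligned}$$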

Here we used the definition of H, then (144) with H in place of F, 
then (146), then — the main step — Theorem 10.33, and finally (143), 
extended in the obvious way from curves to 1-chains.

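10.51 The divergence theorem If $\mathbf{F}$ is a vector field of class $\mathscr{C}'$
in an open set $E \subset R^3$, and if $\Omega$ is a closed subset of $E$ with positively
oriented boundary $\partial\Omega$ (as described in Sec. 10.31) then

$$\int_\Omega (\nabla \cdot \mathbf{F})\,dV
  = \int_{\partial\Omega} (\mathbf{F} \cdot \mathbf{n})\,dA. \tag{147}$$

Proof By (125),

$$d\omega_{\mathbf{F}} = (\nabla \cdot \mathbf{F})\,dx \wedge dy \wedge dz
  = (\nabla \cdot \mathbf{F})\,dV.$$

Hence

$$\int_\Omega (\nabla \cdot \mathbf{F})\,dV = \int_\Omega d\omega_{\mathbf{F}}
  = \int_{\partial\Omega} \omega_{\mathbf{F}}
  = \int_{\partial\Omega} (\mathbf{F} \cdot \mathbf{n})\,dA,$$

by Theorem 10.33, applied to the 2-form $\omega_{\mathbf{F}}$, and (144).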
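
A numerical instance of (147) on the closed unit ball may be helpful. The sketch below makes several assumptions that are not in the text: the sample field $\mathbf{F} = (x^3, y^3, z^3)$, the use of the sphere parametrization $\Sigma$ of Example 10.32 as the positively oriented boundary of the ball (its normal from (129) works out to $\sin u \cdot \Sigma(u, v)$, which points outward), and the standard spherical-coordinates Jacobian $\rho^2 \sin u$ for the volume integral. Both sides should be close to $12\pi/5$.

```python
import math

F = lambda x, y, z: (x ** 3, y ** 3, z ** 3)          # hypothetical field
div_F = lambda x, y, z: 3 * (x * x + y * y + z * z)   # its divergence

def sphere_point(rho, u, v):
    return (rho * math.sin(u) * math.cos(v),
            rho * math.sin(u) * math.sin(v),
            rho * math.cos(u))

def flux_through_unit_sphere(n=200):
    """Integral of F(Sigma(u, v)) . N(u, v) over D, with N = sin(u) * Sigma."""
    hu, hv = math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            p = sphere_point(1.0, u, (j + 0.5) * hv)
            Fp = F(*p)
            total += sum(f * c for f, c in zip(Fp, p)) * math.sin(u) * hu * hv
    return total

def divergence_over_unit_ball(n_rho=80, n_u=80, n_v=16):
    """Integral of div F over the ball in spherical coordinates, whose
    Jacobian is rho^2 sin(u)."""
    hr, hu, hv = 1.0 / n_rho, math.pi / n_u, 2 * math.pi / n_v
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * hr
        for j in range(n_u):
            u = (j + 0.5) * hu
            for k in range(n_v):
                p = sphere_point(rho, u, (k + 0.5) * hv)
                total += div_F(*p) * rho * rho * math.sin(u) * hr * hu * hv
    return total

flux = flux_through_unit_sphere()
volume_integral = divergence_over_unit_ball()
print(flux, volume_integral, 12 * math.pi / 5)
```

The two values agree only up to the discretization error of the midpoint rule, so a tolerance of about $10^{-2}$ is appropriate for these grid sizes.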


EXERCISES 

1. Let $H$ be a compact convex set in $R^k$, with nonempty interior. Let
$f \in \mathscr{C}(H)$, put $f(\mathbf{x}) = 0$ in the complement of $H$, and define
$\int_H f$ as in Definition 10.3.

Prove that $\int_H f$ is independent of the order in which the $k$ integrations are
carried out.

Hint: Approximate $f$ by functions that are continuous on $R^k$ and whose
supports are in $H$, as was done in Example 10.4.

2. For $i = 1, 2, 3, \dots$, let $\varphi_i \in \mathscr{C}(R^1)$ have support in
$(2^{-i}, 2^{1-i})$, such that $\int \varphi_i = 1$. Put

$$f(x, y) = \sum_{i=1}^{\infty} [\varphi_i(x) - \varphi_{i+1}(x)]\,\varphi_i(y).$$

Then $f$ has compact support in $R^2$, $f$ is continuous except at $(0, 0)$, and

$$\int dy \int f(x, y)\,dx = 0 \quad\text{but}\quad \int dx \int f(x, y)\,dy = 1.$$

Observe that $f$ is unbounded in every neighborhood of $(0, 0)$.
3. (a) If $\mathbf{F}$ is as in Theorem 10.7, put $A = \mathbf{F}'(0)$,
$\mathbf{F}_1(\mathbf{x}) = A^{-1}\mathbf{F}(\mathbf{x})$. Then $\mathbf{F}_1'(0) = I$.
Show that

$$\mathbf{F}_1(\mathbf{x}) = \mathbf{G}_n \circ \mathbf{G}_{n-1} \circ \cdots
  \circ \mathbf{G}_1(\mathbf{x})$$

in some neighborhood of 0, for certain primitive mappings
$\mathbf{G}_1, \dots, \mathbf{G}_n$. This gives another version of Theorem 10.7:

$$\mathbf{F}(\mathbf{x}) = \mathbf{F}'(0)\,\mathbf{G}_n \circ \mathbf{G}_{n-1} \circ
  \cdots \circ \mathbf{G}_1(\mathbf{x}).$$

(b) Prove that the mapping $(x, y) \to (y, x)$ of $R^2$ onto $R^2$ is not the
composition of any two primitive mappings, in any neighborhood of the origin. (This
shows that the flips $B_i$ cannot be omitted from the statement of Theorem 10.7.)

4. For $(x, y) \in R^2$, define

$$\mathbf{F}(x, y) = (e^x \cos y - 1,\ e^x \sin y).$$

Prove that $\mathbf{F} = \mathbf{G}_2 \circ \mathbf{G}_1$, where

$$\begin{aligned}
\mathbf{G}_1(x, y) &= (e^x \cos y - 1,\ y) \\
\mathbf{G}_2(u, v) &= (u,\ (1 + u) \tan v)
\end{aligned}$$

are primitive in some neighborhood of $(0, 0)$.

Compute the Jacobians of $\mathbf{G}_1$, $\mathbf{G}_2$, $\mathbf{F}$ at $(0, 0)$. Define

$$\mathbf{H}_2(x, y) = (x,\ e^x \sin y)$$

and find

$$\mathbf{H}_1(u, v) = (h(u, v),\ v)$$

so that $\mathbf{F} = \mathbf{H}_1 \circ \mathbf{H}_2$ in some neighborhood of $(0, 0)$.

5. Formulate and prove an analogue of Theorem 10.8, in which $K$ is a compact
subset of an arbitrary metric space. (Replace the functions $\varphi_i$ that occur in
the proof of Theorem 10.8 by functions of the type constructed in Exercise 22 of
Chap. 4.)

6. Strengthen the conclusion of Theorem 10.8 by showing that the functions $\psi_i$
can be made differentiable, and even infinitely differentiable. (Use Exercise 1 of
Chap. 8 in the construction of the auxiliary functions $\varphi_i$.)

7. (a) Show that the simplex $Q^k$ is the smallest convex subset of $R^k$ that contains
$0, \mathbf{e}_1, \dots, \mathbf{e}_k$.

(b) Show that affine mappings take convex sets to convex sets.

8. Let $H$ be the parallelogram in $R^2$ whose vertices are (1, 1), (3, 2), (4, 5), (2, 4).
Find the affine map $T$ which sends (0, 0) to (1, 1), (1, 0) to (3, 2), (0, 1) to (2, 4).
Show that $J_T = 5$. Use $T$ to convert the integral

$$\alpha = \int_H e^{x-y}\,dx\,dy$$

to an integral over $I^2$ and thus compute $\alpha$.
9. Define $(x, y) = T(r, \theta)$ on the rectangle

$$0 \le r \le a, \qquad 0 \le \theta \le 2\pi$$

by the equations

$$x = r \cos\theta, \qquad y = r \sin\theta.$$

Show that $T$ maps this rectangle onto the closed disc $D$ with center at (0, 0) and
radius $a$, that $T$ is one-to-one in the interior of the rectangle, and that
$J_T(r, \theta) = r$. If $f \in \mathscr{C}(D)$, prove the formula for integration in
polar coordinates:

$$\int_D f(x, y)\,dx\,dy = \int_0^a \int_0^{2\pi} f(T(r, \theta))\,r\,dr\,d\theta.$$

Hint: Let $D_0$ be the interior of $D$, minus the interval from (0, 0) to (0, a).
As it stands, Theorem 10.9 applies to continuous functions $f$ whose support lies in
$D_0$. To remove this restriction, proceed as in Example 10.4.

10. Let $a \to \infty$ in Exercise 9 and prove that

$$\int_{R^2} f(x, y)\,dx\,dy
  = \int_0^\infty \int_0^{2\pi} f(T(r, \theta))\,r\,dr\,d\theta,$$

for continuous functions $f$ that decrease sufficiently rapidly as
$|x| + |y| \to \infty$. (Find a more precise formulation.) Apply this to

$$f(x, y) = \exp(-x^2 - y^2)$$

to derive formula (101) of Chap. 8.

11. Define $(u, v) = T(s, t)$ on the strip

$$0 < s < \infty, \qquad 0 < t < 1$$

by setting $u = s - st$, $v = st$. Show that $T$ is a 1-1 mapping of the strip onto the
positive quadrant $Q$ in $R^2$. Show that $J_T(s, t) = s$.

For $x > 0$, $y > 0$, integrate

$$u^{x-1} e^{-u} v^{y-1} e^{-v}$$

over $Q$, use Theorem 10.9 to convert the integral to one over the strip, and derive
formula (96) of Chap. 8 in this way.

(For this application, Theorem 10.9 has to be extended so as to cover certain
improper integrals. Provide this extension.)

12. Let $I^k$ be the set of all $\mathbf{u} = (u_1, \dots, u_k) \in R^k$ with
$0 \le u_i \le 1$ for all $i$; let $Q^k$ be the set of all
$\mathbf{x} = (x_1, \dots, x_k) \in R^k$ with $x_i \ge 0$, $\sum x_i \le 1$. ($I^k$ is the
unit cube; $Q^k$ is the standard simplex in $R^k$.) Define $\mathbf{x} = T(\mathbf{u})$ by

$$\begin{aligned}
x_1 &= u_1 \\
x_2 &= (1 - u_1) u_2 \\
&\ \ \vdots \\
x_k &= (1 - u_1) \cdots (1 - u_{k-1}) u_k.
\end{aligned}$$

Show that

$$\sum_{i=1}^{k} x_i = 1 - \prod_{i=1}^{k} (1 - u_i).$$

Show that $T$ maps $I^k$ onto $Q^k$, that $T$ is 1-1 in the interior of $I^k$, and that
its inverse $S$ is defined in the interior of $Q^k$ by $u_1 = x_1$ and

$$u_i = \frac{x_i}{1 - x_1 - \cdots - x_{i-1}}$$

for $i = 2, \dots, k$. Show that

$$J_T(\mathbf{u}) = (1 - u_1)^{k-1} (1 - u_2)^{k-2} \cdots (1 - u_{k-1}),$$

and

$$J_S(\mathbf{x}) = [(1 - x_1)(1 - x_1 - x_2) \cdots (1 - x_1 - \cdots - x_{k-1})]^{-1}.$$

13. Let $r_1, \dots, r_k$ be nonnegative integers, and prove that

$$\int_{Q^k} x_1^{r_1} \cdots x_k^{r_k}\,d\mathbf{x}
  = \frac{r_1! \cdots r_k!}{(k + r_1 + \cdots + r_k)!}.$$

Hint: Use Exercise 12, Theorems 10.9 and 8.20.

Note that the special case $r_1 = \cdots = r_k = 0$ shows that the volume of $Q^k$
is $1/k!$.

14. Prove formula (46). 

15. If $\omega$ and $\lambda$ are k- and m-forms, respectively, prove that

$$\omega \wedge \lambda = (-1)^{km}\,\lambda \wedge \omega.$$

16. If $k \ge 2$ and $\sigma = [\mathbf{p}_0, \mathbf{p}_1, \dots, \mathbf{p}_k]$ is an
oriented affine k-simplex, prove that $\partial^2\sigma = 0$, directly from the
definition of the boundary operator $\partial$. Deduce from this that
$\partial^2\Psi = 0$ for every chain $\Psi$.

Hint: For orientation, do it first for $k = 2$, $k = 3$. In general, if $i < j$, let
$\sigma_{ij}$ be the $(k-2)$-simplex obtained by deleting $\mathbf{p}_i$ and
$\mathbf{p}_j$ from $\sigma$. Show that each $\sigma_{ij}$ occurs twice in
$\partial^2\sigma$, with opposite sign.

17. Put $J^2 = \tau_1 + \tau_2$, where

$$\tau_1 = [0, \mathbf{e}_1, \mathbf{e}_1 + \mathbf{e}_2], \qquad
  \tau_2 = -[0, \mathbf{e}_2, \mathbf{e}_2 + \mathbf{e}_1].$$

Explain why it is reasonable to call $J^2$ the positively oriented unit square in $R^2$.
Show that $\partial J^2$ is the sum of 4 oriented affine 1-simplexes. Find these. What is
$\partial(\tau_1 - \tau_2)$?

18. Consider the oriented affine 3-simplex

$$\sigma_1 = [0, \mathbf{e}_1, \mathbf{e}_1 + \mathbf{e}_2,
  \mathbf{e}_1 + \mathbf{e}_2 + \mathbf{e}_3]$$

in $R^3$. Show that $\sigma_1$ (regarded as a linear transformation) has determinant 1.
Thus $\sigma_1$ is positively oriented.

Let $\sigma_2, \dots, \sigma_6$ be five other oriented 3-simplexes, obtained as follows:
There are five permutations $(i_1, i_2, i_3)$ of (1, 2, 3), distinct from (1, 2, 3).
Associate with each $(i_1, i_2, i_3)$ the simplex

$$s(i_1, i_2, i_3)\,[0, \mathbf{e}_{i_1}, \mathbf{e}_{i_1} + \mathbf{e}_{i_2},
  \mathbf{e}_{i_1} + \mathbf{e}_{i_2} + \mathbf{e}_{i_3}]$$

where $s$ is the sign that occurs in the definition of the determinant. (This is how
$\tau_2$ was obtained from $\tau_1$ in Exercise 17.)

Show that $\sigma_2, \dots, \sigma_6$ are positively oriented.

Put $J^3 = \sigma_1 + \cdots + \sigma_6$. Then $J^3$ may be called the positively
oriented unit cube in $R^3$.

Show that $\partial J^3$ is the sum of 12 oriented affine 2-simplexes. (These 12
triangles cover the surface of the unit cube $I^3$.)

Show that $\mathbf{x} = (x_1, x_2, x_3)$ is in the range of $\sigma_1$ if and only if
$0 \le x_3 \le x_2 \le x_1 \le 1$.

Show that the ranges of $\sigma_1, \dots, \sigma_6$ have disjoint interiors, and that
their union covers $I^3$. (Compare with Exercise 13; note that $3! = 6$.)

19. Let $J^2$ and $J^3$ be as in Exercises 17 and 18. Define

\[
\begin{aligned}
B_{01}(u, v) &= (0, u, v), & B_{11}(u, v) &= (1, u, v), \\
B_{02}(u, v) &= (u, 0, v), & B_{12}(u, v) &= (u, 1, v), \\
B_{03}(u, v) &= (u, v, 0), & B_{13}(u, v) &= (u, v, 1).
\end{aligned}
\]

These are affine, and map $R^2$ into $R^3$.

Put $\beta_{ri} = B_{ri}(J^2)$, for $r = 0, 1$, $i = 1, 2, 3$. Each $\beta_{ri}$ is an affine-oriented
2-chain. (See Sec. 10.30.) Verify that

\[ \partial J^3 = \sum_{i=1}^{3} (-1)^i (\beta_{0i} - \beta_{1i}), \]

in agreement with Exercise 18.

20. State conditions under which the formula

\[ \int_\Phi f \, d\omega = \int_{\partial\Phi} f\omega - \int_\Phi (df) \wedge \omega \]

is valid, and show that it generalizes the formula for integration by parts.

Hint: $d(f\omega) = (df) \wedge \omega + f \, d\omega$.

21. As in Example 10.36, consider the 1-form

\[ \eta = \frac{x \, dy - y \, dx}{x^2 + y^2} \]

in $R^2 - \{0\}$.

(a) Carry out the computation that leads to formula (113), and prove that $d\eta = 0$.

(b) Let $\gamma(t) = (r \cos t, r \sin t)$, for some $r > 0$, and let $\Gamma$ be a $\mathscr{C}''$-curve in $R^2 - \{0\}$,
with parameter interval $[0, 2\pi]$, with $\Gamma(0) = \Gamma(2\pi)$, such that the intervals $[\gamma(t), \Gamma(t)]$
do not contain 0 for any $t \in [0, 2\pi]$. Prove that

\[ \int_\Gamma \eta = 2\pi . \]

Hint: For $0 \le t \le 2\pi$, $0 \le u \le 1$, define

\[ \Phi(t, u) = (1 - u)\Gamma(t) + u\gamma(t). \]

Then $\Phi$ is a 2-surface in $R^2 - \{0\}$ whose parameter domain is the indicated rectangle.
Because of cancellations (as in Example 10.32),

\[ \partial\Phi = \Gamma - \gamma . \]

Use Stokes' theorem to deduce that

\[ \int_\Gamma \eta = \int_\gamma \eta \]

because $d\eta = 0$.

(c) Take $\Gamma(t) = (a \cos t, b \sin t)$ where $a > 0$, $b > 0$ are fixed. Use part (b) to
show that

\[ \int_0^{2\pi} \frac{ab}{a^2 \cos^2 t + b^2 \sin^2 t} \, dt = 2\pi . \]

(d) Show that

\[ \eta = d\left( \arctan \frac{y}{x} \right) \]

in any convex open set in which $x \ne 0$, and that

\[ \eta = d\left( -\arctan \frac{x}{y} \right) \]

in any convex open set in which $y \ne 0$.

Explain why this justifies the notation $\eta = d\theta$, in spite of the fact that $\eta$ is
not exact in $R^2 - \{0\}$.

(e) Show that (b) can be derived from (d).

(f) If $\Gamma$ is any closed $\mathscr{C}'$-curve in $R^2 - \{0\}$, prove that

\[ \frac{1}{2\pi} \int_\Gamma \eta = \operatorname{Ind}(\Gamma). \]

(See Exercise 23 of Chap. 8 for the definition of the index of a curve.)
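
The identity in part (c) of Exercise 21 lends itself to a quick numerical check. The sketch below is an illustration added here, not part of the exercise: it uses the composite trapezoidal rule, which is very accurate for smooth periodic integrands. The function name and the number of nodes `n` are arbitrary choices.

```python
from math import cos, sin, pi


def ellipse_integral(a, b, n=2000):
    """Trapezoidal estimate of the integral over [0, 2*pi] of
    a*b / (a^2 cos^2 t + b^2 sin^2 t)."""
    h = 2.0 * pi / n
    total = 0.0
    for j in range(n):            # periodic integrand: endpoints coincide
        t = j * h
        total += a * b / ((a * cos(t)) ** 2 + (b * sin(t)) ** 2)
    return h * total
```

For any fixed $a > 0$, $b > 0$ the returned value should be close to $2\pi$, independently of the shape of the ellipse.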





22. As in Example 10.37, define $\zeta$ in $R^3 - \{0\}$ by

\[ \zeta = \frac{x \, dy \wedge dz + y \, dz \wedge dx + z \, dx \wedge dy}{r^3} \]

where $r = (x^2 + y^2 + z^2)^{1/2}$, let $D$ be the rectangle given by $0 \le u \le \pi$, $0 \le v \le 2\pi$,
and let $\Sigma$ be the 2-surface in $R^3$, with parameter domain $D$, given by

\[ x = \sin u \cos v, \qquad y = \sin u \sin v, \qquad z = \cos u . \]

(a) Prove that $d\zeta = 0$ in $R^3 - \{0\}$.

(b) Let $S$ denote the restriction of $\Sigma$ to a parameter domain $E \subset D$. Prove that

\[ \int_S \zeta = \int_E \sin u \, du \, dv = A(S), \]

where $A$ denotes area, as in Sec. 10.43. Note that this contains (115) as a special
case.

(c) Suppose $g, h_1, h_2, h_3$ are $\mathscr{C}''$-functions on $[0, 1]$, $g > 0$. Let $(x, y, z) = \Phi(s, t)$
define a 2-surface $\Phi$, with parameter domain $I^2$, by

\[ x = g(t)h_1(s), \qquad y = g(t)h_2(s), \qquad z = g(t)h_3(s). \]

Prove that

\[ \int_\Phi \zeta = 0, \]

directly from (35).

Note the shape of the range of $\Phi$: For fixed $s$, $\Phi(s, t)$ runs over an interval
on a line through 0. The range of $\Phi$ thus lies in a "cone" with vertex at the origin.

(d) Let $E$ be a closed rectangle in $D$, with edges parallel to those of $D$. Suppose
$f \in \mathscr{C}''(D)$, $f > 0$. Let $\Omega$ be the 2-surface with parameter domain $E$, defined by

\[ \Omega(u, v) = f(u, v) \, \Sigma(u, v). \]

Define $S$ as in (b) and prove that

\[ \int_\Omega \zeta = \int_S \zeta = A(S). \]

(Since $S$ is the "radial projection" of $\Omega$ into the unit sphere, this result makes it
reasonable to call $\int_\Omega \zeta$ the "solid angle" subtended by the range of $\Omega$ at the origin.)

Hint: Consider the 3-surface $\Psi$ given by

\[ \Psi(t, u, v) = [1 - t + t f(u, v)] \, \Sigma(u, v), \]

where $(u, v) \in E$, $0 \le t \le 1$. For fixed $v$, the mapping $(t, u) \to \Psi(t, u, v)$ is a 2-surface
$\Phi$ to which (c) can be applied to show that $\int_\Phi \zeta = 0$. The same thing holds
when $u$ is fixed. By (a) and Stokes' theorem,

\[ \int_{\partial\Psi} \zeta = \int_\Psi d\zeta = 0 . \]

(e) Put $\lambda = -(z/r)\eta$, where

\[ \eta = \frac{x \, dy - y \, dx}{x^2 + y^2}, \]

as in Exercise 21. Then $\lambda$ is a 1-form in the open set $V \subset R^3$ in which $x^2 + y^2 > 0$.
Show that $\zeta$ is exact in $V$ by showing that

\[ \zeta = d\lambda . \]

(f) Derive (d) from (e), without using (c).

Hint: To begin with, assume $0 < u < \pi$ on $E$. By (e),

\[ \int_\Omega \zeta = \int_{\partial\Omega} \lambda \qquad \text{and} \qquad \int_S \zeta = \int_{\partial S} \lambda . \]

Show that the two integrals of $\lambda$ are equal, by using part (d) of Exercise 21, and by
noting that $z/r$ is the same at $\Sigma(u, v)$ as at $\Omega(u, v)$.

(g) Is $\zeta$ exact in the complement of every line through the origin?

23. Fix $n$. Define $r_k = (x_1^2 + \cdots + x_k^2)^{1/2}$ for $1 \le k \le n$, let $E_k$ be the set of all $x \in R^n$
at which $r_k > 0$, and let $\omega_k$ be the $(k - 1)$-form defined in $E_k$ by

\[ \omega_k = (r_k)^{-k} \sum_{i=1}^{k} (-1)^{i-1} x_i \, dx_1 \wedge \cdots \wedge dx_{i-1} \wedge dx_{i+1} \wedge \cdots \wedge dx_k . \]

Note that $\omega_2 = \eta$, $\omega_3 = \zeta$, in the terminology of Exercises 21 and 22. Note
also that

\[ E_1 \subset E_2 \subset \cdots \subset E_n = R^n - \{0\}. \]

(a) Prove that $d\omega_k = 0$ in $E_k$.

(b) For $k = 2, \dots, n$, prove that $\omega_k$ is exact in $E_{k-1}$, by showing that

\[ \omega_k = d(f_k \omega_{k-1}) = (df_k) \wedge \omega_{k-1}, \]

where $f_k(x) = (-1)^k g_k(x_k / r_k)$ and

\[ g_k(t) = \int_{-1}^{t} (1 - s^2)^{(k-3)/2} \, ds \qquad (-1 < t < 1). \]

Hint: $f_k$ satisfies the differential equations

\[ x \cdot (\nabla f_k)(x) = 0 \]

and

\[ (D_k f_k)(x) = \frac{(-1)^k (r_{k-1})^{k-1}}{(r_k)^k} . \]

(c) Is $\omega_n$ exact in $E_n$?

(d) Note that (b) is a generalization of part (e) of Exercise 22. Try to extend some
of the other assertions of Exercises 21 and 22 to $\omega_n$, for arbitrary $n$.

24. Let $\omega = \sum a_i(x) \, dx_i$ be a 1-form of class $\mathscr{C}''$ in a convex open set $E \subset R^n$. Assume
$d\omega = 0$ and prove that $\omega$ is exact in $E$, by completing the following outline:

Fix $p \in E$. Define

\[ f(x) = \int_{[p, x]} \omega \qquad (x \in E). \]

Apply Stokes' theorem to affine-oriented 2-simplexes $[p, x, y]$ in $E$. Deduce that

\[ f(y) - f(x) = \sum_{i=1}^{n} (y_i - x_i) \int_0^1 a_i((1 - t)x + ty) \, dt \]

for $x \in E$, $y \in E$. Hence $(D_i f)(x) = a_i(x)$.

25. Assume that $\omega$ is a 1-form in an open set $E \subset R^n$ such that

\[ \int_\gamma \omega = 0 \]

for every closed curve $\gamma$ in $E$, of class $\mathscr{C}'$. Prove that $\omega$ is exact in $E$, by imitating
part of the argument sketched in Exercise 24.

26. Assume $\omega$ is a 1-form in $R^3 - \{0\}$, of class $\mathscr{C}'$ and $d\omega = 0$. Prove that $\omega$ is exact in
$R^3 - \{0\}$.

Hint: Every closed continuously differentiable curve in $R^3 - \{0\}$ is the
boundary of a 2-surface in $R^3 - \{0\}$. Apply Stokes' theorem and Exercise 25.

27. Let $E$ be an open 3-cell in $R^3$, with edges parallel to the coordinate axes. Suppose
$(a, b, c) \in E$, $f_i \in \mathscr{C}'(E)$ for $i = 1, 2, 3$,

\[ \omega = f_1 \, dy \wedge dz + f_2 \, dz \wedge dx + f_3 \, dx \wedge dy, \]

and assume that $d\omega = 0$ in $E$. Define

\[ \lambda = g_1 \, dx + g_2 \, dy \]

where

\[
\begin{aligned}
g_1(x, y, z) &= \int_c^z f_2(x, y, s) \, ds - \int_b^y f_3(x, t, c) \, dt, \\
g_2(x, y, z) &= -\int_c^z f_1(x, y, s) \, ds,
\end{aligned}
\]

for $(x, y, z) \in E$. Prove that $d\lambda = \omega$ in $E$.

Evaluate these integrals when $\omega = \zeta$ and thus find the form $\lambda$ that occurs in
part (e) of Exercise 22.

28. Fix $b > a > 0$, define

\[ \Phi(r, \theta) = (r \cos \theta, r \sin \theta) \]

for $a \le r \le b$, $0 \le \theta \le 2\pi$. (The range of $\Phi$ is an annulus in $R^2$.) Put $\omega = x^3 \, dy$,
and compute both

\[ \int_\Phi d\omega \qquad \text{and} \qquad \int_{\partial\Phi} \omega \]

to verify that they are equal.
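
A numerical comparison of the two integrals in Exercise 28 can serve as a check on a hand computation. The sketch below is an added illustration under stated assumptions: it uses $d\omega = 3x^2 \, dx \wedge dy$, the Jacobian $r$ of $\Phi$, and the fact that $\partial\Phi$ reduces to the outer circle traversed counterclockwise minus the inner circle traversed counterclockwise (the two radial edges cancel). Function names and grid sizes are arbitrary.

```python
from math import cos, pi


def integral_d_omega(a, b, n=400):
    """Midpoint-rule estimate of the integral of d(omega) = 3 x^2 dx^dy over Phi.
    In polar coordinates the integrand is 3 (r cos t)^2 * r on [a, b] x [0, 2*pi]."""
    hr, ht = (b - a) / n, 2.0 * pi / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += 3.0 * (r * cos(t)) ** 2 * r
    return total * hr * ht


def integral_omega_boundary(a, b, n=400):
    """Integral of omega = x^3 dy over the boundary: outer circle minus inner circle."""
    def over_circle(r):
        h = 2.0 * pi / n
        # x^3 dy with x = r cos t, dy = r cos t dt
        return h * sum((r * cos(j * h)) ** 3 * r * cos(j * h) for j in range(n))
    return over_circle(b) - over_circle(a)
```

With, say, `a = 1.0` and `b = 2.0`, the two functions should return values that agree up to the quadrature error of the midpoint rule.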

29. Prove the existence of a function $\alpha$ with the properties needed in the proof of
Theorem 10.38, and prove that the resulting function $F$ is of class $\mathscr{C}'$. (Both
assertions become trivial if $E$ is an open cell or an open ball, since $\alpha$ can then be
taken to be a constant. Refer to Theorem 9.42.)

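30. If $\mathbf{N}$ is the vector given by (135), prove that

\[
\det \begin{bmatrix}
\alpha_1 & \beta_1 & \alpha_2\beta_3 - \alpha_3\beta_2 \\
\alpha_2 & \beta_2 & \alpha_3\beta_1 - \alpha_1\beta_3 \\
\alpha_3 & \beta_3 & \alpha_1\beta_2 - \alpha_2\beta_1
\end{bmatrix} = |\mathbf{N}|^2 .
\]

Also, verify Eq. (137).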

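31. Let $E \subset R^3$ be open, suppose $g \in \mathscr{C}''(E)$, $h \in \mathscr{C}''(E)$, and consider the vector field

\[ \mathbf{F} = g \nabla h . \]

(a) Prove that

\[ \nabla \cdot \mathbf{F} = g \nabla^2 h + (\nabla g) \cdot (\nabla h) \]

where $\nabla^2 h = \nabla \cdot (\nabla h) = \sum \partial^2 h / \partial x_i^2$ is the so-called "Laplacian" of $h$.

(b) If $\Omega$ is a closed subset of $E$ with positively oriented boundary $\partial\Omega$ (as in
Theorem 10.51), prove that

\[ \int_\Omega [g \nabla^2 h + (\nabla g) \cdot (\nabla h)] \, dV = \int_{\partial\Omega} g \frac{\partial h}{\partial n} \, dA \]

where (as is customary) we have written $\partial h / \partial n$ in place of $(\nabla h) \cdot \mathbf{n}$. (Thus $\partial h / \partial n$
is the directional derivative of $h$ in the direction of the outward normal to $\partial\Omega$, the
so-called normal derivative of $h$.) Interchange $g$ and $h$, subtract the resulting
formula from the first one, to obtain

\[ \int_\Omega (g \nabla^2 h - h \nabla^2 g) \, dV = \int_{\partial\Omega} \left( g \frac{\partial h}{\partial n} - h \frac{\partial g}{\partial n} \right) dA . \]

These two formulas are usually called Green's identities.

(c) Assume that $h$ is harmonic in $E$; this means that $\nabla^2 h = 0$. Take $g = 1$ and conclude
that

\[ \int_{\partial\Omega} \frac{\partial h}{\partial n} \, dA = 0 . \]

Take $g = h$, and conclude that $h = 0$ in $\Omega$ if $h = 0$ on $\partial\Omega$.

(d) Show that Green's identities are also valid in $R^2$.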

32. Fix $\delta$, $0 < \delta < 1$. Let $D$ be the set of all $(\theta, t) \in R^2$ such that $0 \le \theta \le \pi$, $-\delta \le t \le \delta$.
Let $\Phi$ be the 2-surface in $R^3$, with parameter domain $D$, given by

\[
\begin{aligned}
x &= (1 - t \sin \theta) \cos 2\theta \\
y &= (1 - t \sin \theta) \sin 2\theta \\
z &= t \cos \theta
\end{aligned}
\]

where $(x, y, z) = \Phi(\theta, t)$. Note that $\Phi(\pi, t) = \Phi(0, -t)$, and that $\Phi$ is one-to-one
on the rest of $D$.

The range $M = \Phi(D)$ of $\Phi$ is known as a Möbius band. It is the simplest
example of a nonorientable surface.

Prove the various assertions made in the following description: Put
$p_1 = (0, -\delta)$, $p_2 = (\pi, -\delta)$, $p_3 = (\pi, \delta)$, $p_4 = (0, \delta)$, $p_5 = p_1$. Put $\gamma_i = [p_i, p_{i+1}]$,
$i = 1, \dots, 4$, and put $\Gamma_i = \Phi \circ \gamma_i$. Then

\[ \partial\Phi = \Gamma_1 + \Gamma_2 + \Gamma_3 + \Gamma_4 . \]

Put $a = (1, 0, -\delta)$, $b = (1, 0, \delta)$. Then

\[ \Phi(p_1) = \Phi(p_3) = a, \qquad \Phi(p_2) = \Phi(p_4) = b, \]

and $\partial\Phi$ can be described as follows.

$\Gamma_1$ spirals up from $a$ to $b$; its projection into the $(x, y)$-plane has winding
number $+1$ around the origin. (See Exercise 23, Chap. 8.)

$\Gamma_2 = [b, a]$.

$\Gamma_3$ spirals up from $a$ to $b$; its projection into the $(x, y)$-plane has winding
number $-1$ around the origin.

$\Gamma_4 = [b, a]$.

Thus $\partial\Phi = \Gamma_1 + \Gamma_3 + 2\Gamma_2$.

If we go from $a$ to $b$ along $\Gamma_1$ and continue along the "edge" of $M$ until we
return to $a$, the curve traced out is

\[ \Gamma = \Gamma_1 - \Gamma_3, \]

which may also be represented on the parameter interval $[0, 2\pi]$ by the equations

\[
\begin{aligned}
x &= (1 + \delta \sin \theta) \cos 2\theta \\
y &= (1 + \delta \sin \theta) \sin 2\theta \\
z &= -\delta \cos \theta .
\end{aligned}
\]

It should be emphasized that $\Gamma \ne \partial\Phi$: Let $\eta$ be the 1-form discussed in
Exercises 21 and 22. Since $d\eta = 0$, Stokes' theorem shows that

\[ \int_{\partial\Phi} \eta = 0 . \]

But although $\Gamma$ is the "geometric" boundary of $M$, we have

\[ \int_\Gamma \eta = 4\pi . \]

In order to avoid this possible source of confusion, Stokes' formula (Theorem
10.50) is frequently stated only for orientable surfaces $\Phi$.
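
The value $4\pi$ for the integral of $\eta$ over the edge curve $\Gamma$ of the Möbius band can be checked numerically. The sketch below is an added illustration, not part of the exercise. It relies on the fact from Exercise 21(d) that $\eta$ is locally the differential of the polar angle, so the integral equals the total angle swept by the projection of $\Gamma$ into the $(x, y)$-plane; the function names and the sample count `n` are arbitrary.

```python
from math import atan2, cos, sin, pi


def edge_curve_xy(theta, delta):
    """Projection into the (x, y)-plane of the edge curve, for theta in [0, 2*pi]."""
    rho = 1.0 + delta * sin(theta)
    return rho * cos(2.0 * theta), rho * sin(2.0 * theta)


def integral_eta(delta, n=20000):
    """Total signed angle swept around the origin, which equals the integral
    of eta = (x dy - y dx) / (x^2 + y^2) along the curve."""
    total = 0.0
    x0, y0 = edge_curve_xy(0.0, delta)
    for j in range(1, n + 1):
        x1, y1 = edge_curve_xy(2.0 * pi * j / n, delta)
        # signed angle between consecutive position vectors
        total += atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        x0, y0 = x1, y1
    return total
```

For any $0 < \delta < 1$ the result should be close to $4\pi$, reflecting that the projected edge winds twice around the origin.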



11 

THE LEBESGUE THEORY 


It is the purpose of this chapter to present the fundamental concepts of the 
Lebesgue theory of measure and integration and to prove some of the crucial 
theorems in a rather general setting, without obscuring the main lines of the 
development by a mass of comparatively trivial detail. Therefore proofs are 
only sketched in some cases, and some of the easier propositions are stated 
without proof. However, the reader who has become familiar with the tech- 
niques used in the preceding chapters will certainly find no difficulty in supply- 
ing the missing steps. 

The theory of the Lebesgue integral can be developed in several distinct 
ways. Only one of these methods will be discussed here. For alternative 
procedures we refer to the more specialized treatises on integration listed in 
the Bibliography. 


SET FUNCTIONS 

If $A$ and $B$ are any two sets, we write $A - B$ for the set of all elements $x$ such
that $x \in A$, $x \notin B$. The notation $A - B$ does not imply that $B \subset A$. We denote
the empty set by 0, and say that $A$ and $B$ are disjoint if $A \cap B = 0$.

11.1 Definition A family $\mathscr{R}$ of sets is called a ring if $A \in \mathscr{R}$ and $B \in \mathscr{R}$ implies

\[ A \cup B \in \mathscr{R}, \qquad A - B \in \mathscr{R}. \tag{1} \]

Since $A \cap B = A - (A - B)$, we also have $A \cap B \in \mathscr{R}$ if $\mathscr{R}$ is a ring.

A ring $\mathscr{R}$ is called a $\sigma$-ring if

\[ \bigcup_{n=1}^{\infty} A_n \in \mathscr{R} \tag{2} \]

whenever $A_n \in \mathscr{R}$ $(n = 1, 2, 3, \dots)$. Since

\[ \bigcap_{n=1}^{\infty} A_n = A_1 - \bigcup_{n=1}^{\infty} (A_1 - A_n), \]

we also have

\[ \bigcap_{n=1}^{\infty} A_n \in \mathscr{R} \]

if $\mathscr{R}$ is a $\sigma$-ring.

11.2 Definition We say that $\phi$ is a set function defined on $\mathscr{R}$ if $\phi$ assigns to
every $A \in \mathscr{R}$ a number $\phi(A)$ of the extended real number system. $\phi$ is additive
if $A \cap B = 0$ implies

\[ \phi(A \cup B) = \phi(A) + \phi(B), \tag{3} \]

and $\phi$ is countably additive if $A_i \cap A_j = 0$ $(i \ne j)$ implies

\[ \phi\left( \bigcup_{n=1}^{\infty} A_n \right) = \sum_{n=1}^{\infty} \phi(A_n). \tag{4} \]

We shall always assume that the range of $\phi$ does not contain both $+\infty$
and $-\infty$; for if it did, the right side of (3) could become meaningless. Also,
we exclude set functions whose only value is $+\infty$ or $-\infty$.

It is interesting to note that the left side of (4) is independent of the order
in which the $A_n$'s are arranged. Hence the rearrangement theorem shows that
the right side of (4) converges absolutely if it converges at all; if it does not
converge, the partial sums tend to $+\infty$, or to $-\infty$.

If $\phi$ is additive, the following properties are easily verified:

\[ \phi(0) = 0. \tag{5} \]

\[ \phi(A_1 \cup \cdots \cup A_n) = \phi(A_1) + \cdots + \phi(A_n) \tag{6} \]

if $A_i \cap A_j = 0$ whenever $i \ne j$.

\[ \phi(A_1 \cup A_2) + \phi(A_1 \cap A_2) = \phi(A_1) + \phi(A_2). \tag{7} \]

If $\phi(A) \ge 0$ for all $A$, and $A_1 \subset A_2$, then

\[ \phi(A_1) \le \phi(A_2). \tag{8} \]

Because of (8), nonnegative additive set functions are often called
monotonic.

\[ \phi(A - B) = \phi(A) - \phi(B) \tag{9} \]

if $B \subset A$, and $|\phi(B)| < +\infty$.
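
Properties (5) to (9) can be tried out on the simplest concrete example. The sketch below is an added illustration: it takes the ring of all finite subsets of the integers with the counting measure (a hypothetical choice of $\phi$ made only for this example) and checks the identities on a given pair of sets.

```python
def counting_measure(A):
    """phi(A) = number of elements of A; nonnegative and additive on finite sets."""
    return len(A)


def satisfies_5_to_9(phi, A1, A2):
    """Check properties (5), (7), (8), (9) for one pair of finite sets."""
    empty = frozenset()
    common = A1 & A2                                   # a subset of A1
    ok_5 = phi(empty) == 0
    ok_7 = phi(A1 | A2) + phi(common) == phi(A1) + phi(A2)
    ok_8 = phi(common) <= phi(A1)
    ok_9 = phi(A1 - common) == phi(A1) - phi(common)
    return ok_5 and ok_7 and ok_8 and ok_9
```

A check on a few examples does not replace the short proofs, but it makes the role of disjointness in (3) and (6) easy to see: (7) is what (3) becomes once the overlap $A_1 \cap A_2$ is accounted for.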

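11.3 Theorem Suppose $\phi$ is countably additive on a ring $\mathscr{R}$. Suppose $A_n \in \mathscr{R}$
$(n = 1, 2, 3, \dots)$, $A_1 \subset A_2 \subset A_3 \subset \cdots$, $A \in \mathscr{R}$, and

\[ A = \bigcup_{n=1}^{\infty} A_n . \]

Then, as $n \to \infty$,

\[ \phi(A_n) \to \phi(A). \]

Proof Put $B_1 = A_1$, and

\[ B_n = A_n - A_{n-1} \qquad (n = 2, 3, \dots). \]

Then $B_i \cap B_j = 0$ for $i \ne j$, $A_n = B_1 \cup \cdots \cup B_n$, and $A = \bigcup B_n$. Hence

\[ \phi(A_n) = \sum_{i=1}^{n} \phi(B_i) \]

and

\[ \phi(A) = \sum_{i=1}^{\infty} \phi(B_i). \]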

CONSTRUCTION OF THE LEBESGUE MEASURE 

11.4 Definition Let $R^p$ denote $p$-dimensional euclidean space. By an interval
in $R^p$ we mean the set of points $x = (x_1, \dots, x_p)$ such that

\[ a_i \le x_i \le b_i \qquad (i = 1, \dots, p), \tag{10} \]

or the set of points which is characterized by (10) with any or all of the $\le$
signs replaced by $<$. The possibility that $a_i = b_i$ for any value of $i$ is not ruled
out; in particular, the empty set is included among the intervals.

If $A$ is the union of a finite number of intervals, $A$ is said to be an elementary
set.

If $I$ is an interval, we define

\[ m(I) = \prod_{i=1}^{p} (b_i - a_i), \]

no matter whether equality is included or excluded in any of the inequalities (10).
If $A = I_1 \cup \cdots \cup I_n$, and if these intervals are pairwise disjoint, we set

\[ m(A) = m(I_1) + \cdots + m(I_n). \tag{11} \]

We let $\mathscr{E}$ denote the family of all elementary subsets of $R^p$.

At this point, the following properties should be verified:

(12) $\mathscr{E}$ is a ring, but not a $\sigma$-ring.

(13) If $A \in \mathscr{E}$, then $A$ is the union of a finite number of disjoint intervals.

(14) If $A \in \mathscr{E}$, $m(A)$ is well defined by (11); that is, if two different decompositions
of $A$ into disjoint intervals are used, each gives rise to the same
value of $m(A)$.

(15) $m$ is additive on $\mathscr{E}$.

Note that if $p = 1, 2, 3$, then $m$ is length, area, and volume, respectively.
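
For $p = 1$, properties (13) and (14) can be made concrete with a few lines of code. The sketch below is an added illustration under simplifying assumptions: intervals are represented as pairs `(a, b)` with `a <= b`, endpoints are ignored (they do not affect $m$), and touching intervals are merged. The function names are illustrative.

```python
def disjoint_decomposition(intervals):
    """Rewrite a finite union of intervals (a, b) as a union of pairwise
    disjoint intervals, as in property (13), by merging overlaps."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # overlaps or touches the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def m(intervals):
    """m(A) for the elementary set A given as a union of intervals, via (11)."""
    return sum(b - a for a, b in disjoint_decomposition(intervals))
```

For instance, the union of $(0, 1)$, $(0.5, 2)$ and $(3, 4)$ decomposes into $(0, 2)$ and $(3, 4)$, so that $m = 3$; listing the same set through a different family of intervals gives the same value, which is the content of (14).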

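11.5 Definition A nonnegative additive set function $\phi$ defined on $\mathscr{E}$ is said to
be regular if the following is true: To every $A \in \mathscr{E}$ and to every $\varepsilon > 0$ there
exist sets $F \in \mathscr{E}$, $G \in \mathscr{E}$ such that $F$ is closed, $G$ is open, $F \subset A \subset G$, and

\[ \phi(G) - \varepsilon \le \phi(A) \le \phi(F) + \varepsilon. \tag{16} \]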

11.6 Examples

(a) The set function $m$ is regular.

If $A$ is an interval, it is trivial that the requirements of Definition
11.5 are satisfied. The general case follows from (13).

(b) Take $R^p = R^1$, and let $\alpha$ be a monotonically increasing function,
defined for all real $x$. Put

\[
\begin{aligned}
\mu([a, b)) &= \alpha(b-) - \alpha(a-), \\
\mu([a, b]) &= \alpha(b+) - \alpha(a-), \\
\mu((a, b]) &= \alpha(b+) - \alpha(a+), \\
\mu((a, b)) &= \alpha(b-) - \alpha(a+).
\end{aligned}
\]

Here $[a, b)$ is the set $a \le x < b$, etc. Because of the possible discontinuities
of $\alpha$, these cases have to be distinguished. If $\mu$ is defined for
elementary sets as in (11), $\mu$ is regular on $\mathscr{E}$. The proof is just like that
of (a).
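
The need to distinguish the four cases in Example 11.6(b) shows up as soon as $\alpha$ has a jump. The sketch below is an added illustration with a hypothetical choice of $\alpha$: $\alpha(x) = x$ for $x < 0$ and $\alpha(x) = x + 1$ for $x \ge 0$, whose one-sided limits are written out by hand rather than computed.

```python
def alpha_left(x):
    """alpha(x-) for the assumed alpha with a unit jump at 0."""
    return x + (1.0 if x > 0 else 0.0)


def alpha_right(x):
    """alpha(x+) for the same alpha."""
    return x + (1.0 if x >= 0 else 0.0)


def mu(a, b, closed_left, closed_right):
    """mu of the interval from a to b, following the four formulas of 11.6(b):
    a closed left end uses alpha(a-), an open one alpha(a+);
    a closed right end uses alpha(b+), an open one alpha(b-)."""
    lower = alpha_left(a) if closed_left else alpha_right(a)
    upper = alpha_right(b) if closed_right else alpha_left(b)
    return upper - lower
```

With this $\alpha$ the single point $\{0\} = [0, 0]$ has $\mu = 1$, while $\mu((0, 1)) = 1$ and $\mu([0, 1)) = 2$, consistent with additivity on the disjoint union $[0, 1) = \{0\} \cup (0, 1)$.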

Our next objective is to show that every regular set function on $\mathscr{E}$ can be
extended to a countably additive set function on a $\sigma$-ring which contains $\mathscr{E}$.

11.7 Definition Let $\mu$ be additive, regular, nonnegative, and finite on $\mathscr{E}$.
Consider countable coverings of any set $E \subset R^p$ by open elementary sets $A_n$:

\[ E \subset \bigcup_{n=1}^{\infty} A_n . \]

Define

\[ \mu^*(E) = \inf \sum_{n=1}^{\infty} \mu(A_n), \tag{17} \]

the inf being taken over all countable coverings of $E$ by open elementary sets.
$\mu^*(E)$ is called the outer measure of $E$, corresponding to $\mu$.

It is clear that $\mu^*(E) \ge 0$ for all $E$ and that

\[ \mu^*(E_1) \le \mu^*(E_2) \tag{18} \]

if $E_1 \subset E_2$.

11.8 Theorem

(a) For every $A \in \mathscr{E}$, $\mu^*(A) = \mu(A)$.

(b) If $E = \bigcup_{n=1}^{\infty} E_n$, then

\[ \mu^*(E) \le \sum_{n=1}^{\infty} \mu^*(E_n). \tag{19} \]

Note that (a) asserts that $\mu^*$ is an extension of $\mu$ from $\mathscr{E}$ to the family of
all subsets of $R^p$. The property (19) is called subadditivity.

Proof Choose $A \in \mathscr{E}$ and $\varepsilon > 0$.

The regularity of $\mu$ shows that $A$ is contained in an open elementary
set $G$ such that $\mu(G) \le \mu(A) + \varepsilon$. Since $\mu^*(A) \le \mu(G)$ and since $\varepsilon$ was
arbitrary, we have

\[ \mu^*(A) \le \mu(A). \tag{20} \]

The definition of $\mu^*$ shows that there is a sequence $\{A_n\}$ of open
elementary sets whose union contains $A$, such that

\[ \sum_{n=1}^{\infty} \mu(A_n) \le \mu^*(A) + \varepsilon. \]

The regularity of $\mu$ shows that $A$ contains a closed elementary set $F$ such
that $\mu(F) \ge \mu(A) - \varepsilon$; and since $F$ is compact, we have

\[ F \subset A_1 \cup \cdots \cup A_N \]

for some $N$. Hence

\[ \mu(A) \le \mu(F) + \varepsilon \le \mu(A_1 \cup \cdots \cup A_N) + \varepsilon \le \sum_{n=1}^{N} \mu(A_n) + \varepsilon \le \mu^*(A) + 2\varepsilon. \]

In conjunction with (20), this proves (a).

Next, suppose $E = \bigcup E_n$, and assume that $\mu^*(E_n) < +\infty$ for all $n$.
Given $\varepsilon > 0$, there are coverings $\{A_{nk}\}$, $k = 1, 2, 3, \dots$, of $E_n$ by open
elementary sets such that

\[ \sum_{k=1}^{\infty} \mu(A_{nk}) \le \mu^*(E_n) + 2^{-n}\varepsilon. \tag{21} \]

Then

\[ \mu^*(E) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \mu(A_{nk}) \le \sum_{n=1}^{\infty} \mu^*(E_n) + \varepsilon, \]

and (19) follows. In the excluded case, i.e., if $\mu^*(E_n) = +\infty$ for some $n$,
(19) is of course trivial.

11.9 Definition For any $A \subset R^p$, $B \subset R^p$, we define

\[ S(A, B) = (A - B) \cup (B - A), \tag{22} \]

\[ d(A, B) = \mu^*(S(A, B)). \tag{23} \]

We write $A_n \to A$ if

\[ \lim_{n \to \infty} d(A, A_n) = 0. \]

If there is a sequence $\{A_n\}$ of elementary sets such that $A_n \to A$, we say
that $A$ is finitely $\mu$-measurable and write $A \in \mathfrak{M}_F(\mu)$.

If $A$ is the union of a countable collection of finitely $\mu$-measurable sets,
we say that $A$ is $\mu$-measurable and write $A \in \mathfrak{M}(\mu)$.

$S(A, B)$ is the so-called "symmetric difference" of $A$ and $B$. We shall see
that $d(A, B)$ is essentially a distance function.

The following theorem will enable us to obtain the desired extension of $\mu$.

11.10 Theorem $\mathfrak{M}(\mu)$ is a $\sigma$-ring, and $\mu^*$ is countably additive on $\mathfrak{M}(\mu)$.

Before we turn to the proof of this theorem, we develop some of the
properties of $S(A, B)$ and $d(A, B)$. We have

\[ S(A, B) = S(B, A), \qquad S(A, A) = 0. \tag{24} \]

\[ S(A, B) \subset S(A, C) \cup S(C, B). \tag{25} \]

\[
\left.\begin{array}{l}
S(A_1 \cup A_2,\, B_1 \cup B_2) \\
S(A_1 \cap A_2,\, B_1 \cap B_2) \\
S(A_1 - A_2,\, B_1 - B_2)
\end{array}\right\} \subset S(A_1, B_1) \cup S(A_2, B_2). \tag{26}
\]

(24) is clear, and (25) follows from

\[ (A - B) \subset (A - C) \cup (C - B), \qquad (B - A) \subset (C - A) \cup (B - C). \]

The first formula of (26) is obtained from

\[ (A_1 \cup A_2) - (B_1 \cup B_2) \subset (A_1 - B_1) \cup (A_2 - B_2). \]

Next, writing $E^c$ for the complement of $E$, we have

\[
\begin{aligned}
S(A_1 \cap A_2, B_1 \cap B_2) &= S(A_1^c \cup A_2^c, B_1^c \cup B_2^c) \\
&\subset S(A_1^c, B_1^c) \cup S(A_2^c, B_2^c) = S(A_1, B_1) \cup S(A_2, B_2);
\end{aligned}
\]

and the last formula of (26) is obtained if we note that

\[ A_1 - A_2 = A_1 \cap A_2^c . \]
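
The set-theoretic relations (24), (25) and (26) are purely combinatorial, so they can be tested mechanically on small finite sets. The sketch below is an added illustration, not a substitute for the derivation above; the universe size, the number of trials, and the function names are arbitrary choices.

```python
import random


def S(A, B):
    """Symmetric difference S(A, B) = (A - B) ∪ (B - A), as in (22)."""
    return (A - B) | (B - A)


def check_symmetric_difference(trials=200, seed=0):
    """Test (24), (25), (26) on random subsets of a 12-point universe."""
    rng = random.Random(seed)

    def random_subset():
        return frozenset(x for x in range(12) if rng.random() < 0.5)

    for _ in range(trials):
        A, B, C, A1, A2, B1, B2 = (random_subset() for _ in range(7))
        assert S(A, B) == S(B, A) and S(A, A) == frozenset()      # (24)
        assert S(A, B) <= S(A, C) | S(C, B)                       # (25)
        bound = S(A1, B1) | S(A2, B2)
        assert S(A1 | A2, B1 | B2) <= bound                       # (26), union
        assert S(A1 & A2, B1 & B2) <= bound                       # (26), intersection
        assert S(A1 - A2, B1 - B2) <= bound                       # (26), difference
    return True
```

Here `<=` on frozensets is the subset relation, so each assertion is a direct transcription of the corresponding inclusion.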

By (23), (19), and (18), these properties of $S(A, B)$ imply

\[ d(A, B) = d(B, A), \qquad d(A, A) = 0, \tag{27} \]

\[ d(A, B) \le d(A, C) + d(C, B), \tag{28} \]

\[
\left.\begin{array}{l}
d(A_1 \cup A_2,\, B_1 \cup B_2) \\
d(A_1 \cap A_2,\, B_1 \cap B_2) \\
d(A_1 - A_2,\, B_1 - B_2)
\end{array}\right\} \le d(A_1, B_1) + d(A_2, B_2). \tag{29}
\]

The relations (27) and (28) show that $d(A, B)$ satisfies the requirements
of Definition 2.15, except that $d(A, B) = 0$ does not imply $A = B$. For instance,
if $\mu = m$, $A$ is countable, and $B$ is empty, we have

\[ d(A, B) = m^*(A) = 0; \]

to see this, cover the $n$th point of $A$ by an interval $I_n$ such that

\[ m(I_n) < 2^{-n}\varepsilon. \]

But if we define two sets $A$ and $B$ to be equivalent, provided

\[ d(A, B) = 0, \]

we divide the subsets of $R^p$ into equivalence classes, and $d(A, B)$ makes the set
of these equivalence classes into a metric space. $\mathfrak{M}_F(\mu)$ is then obtained as the
closure of $\mathscr{E}$. This interpretation is not essential for the proof, but it explains
the underlying idea.

We need one more property of $d(A, B)$, namely,

\[ |\mu^*(A) - \mu^*(B)| \le d(A, B), \tag{30} \]

if at least one of $\mu^*(A)$, $\mu^*(B)$ is finite. For suppose $0 \le \mu^*(B) \le \mu^*(A)$.
Then (28) shows that

\[ d(A, 0) \le d(A, B) + d(B, 0), \]

that is,

\[ \mu^*(A) \le d(A, B) + \mu^*(B). \]

Since $\mu^*(B)$ is finite, it follows that

\[ \mu^*(A) - \mu^*(B) \le d(A, B). \]

Proof of Theorem 11.10 Suppose $A \in \mathfrak{M}_F(\mu)$, $B \in \mathfrak{M}_F(\mu)$. Choose $\{A_n\}$,
$\{B_n\}$ such that $A_n \in \mathscr{E}$, $B_n \in \mathscr{E}$, $A_n \to A$, $B_n \to B$. Then (29) and (30) show
that

\[ A_n \cup B_n \to A \cup B, \tag{31} \]

\[ A_n \cap B_n \to A \cap B, \tag{32} \]

\[ A_n - B_n \to A - B, \tag{33} \]

\[ \mu^*(A_n) \to \mu^*(A), \tag{34} \]

and $\mu^*(A) < +\infty$ since $d(A_n, A) \to 0$. By (31) and (33), $\mathfrak{M}_F(\mu)$ is a ring.
By (7),

\[ \mu(A_n) + \mu(B_n) = \mu(A_n \cup B_n) + \mu(A_n \cap B_n). \]

Letting $n \to \infty$, we obtain, by (34) and Theorem 11.8(a),

\[ \mu^*(A) + \mu^*(B) = \mu^*(A \cup B) + \mu^*(A \cap B). \]

If $A \cap B = 0$, then $\mu^*(A \cap B) = 0$.

It follows that $\mu^*$ is additive on $\mathfrak{M}_F(\mu)$.

Now let $A \in \mathfrak{M}(\mu)$. Then $A$ can be represented as the union of a
countable collection of disjoint sets of $\mathfrak{M}_F(\mu)$. For if $A = \bigcup A_n'$ with
$A_n' \in \mathfrak{M}_F(\mu)$, write $A_1 = A_1'$ and

\[ A_n = (A_1' \cup \cdots \cup A_n') - (A_1' \cup \cdots \cup A_{n-1}') \qquad (n = 2, 3, 4, \dots). \]

Then

\[ A = \bigcup_{n=1}^{\infty} A_n \tag{35} \]

is the required representation. By (19)

\[ \mu^*(A) \le \sum_{n=1}^{\infty} \mu^*(A_n). \tag{36} \]

On the other hand, $A \supset A_1 \cup \cdots \cup A_n$; and by the additivity of
$\mu^*$ on $\mathfrak{M}_F(\mu)$ we obtain

\[ \mu^*(A) \ge \mu^*(A_1 \cup \cdots \cup A_n) = \mu^*(A_1) + \cdots + \mu^*(A_n). \tag{37} \]

Equations (36) and (37) imply

\[ \mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n). \tag{38} \]

Suppose $\mu^*(A)$ is finite. Put $B_n = A_1 \cup \cdots \cup A_n$. Then (38) shows
that

\[ d(A, B_n) = \mu^*\left( \bigcup_{i=n+1}^{\infty} A_i \right) = \sum_{i=n+1}^{\infty} \mu^*(A_i) \to 0 \]

as $n \to \infty$. Hence $B_n \to A$; and since $B_n \in \mathfrak{M}_F(\mu)$, it is easily seen that
$A \in \mathfrak{M}_F(\mu)$.

We have thus shown that $A \in \mathfrak{M}_F(\mu)$ if $A \in \mathfrak{M}(\mu)$ and $\mu^*(A) < +\infty$.
It is now clear that $\mu^*$ is countably additive on $\mathfrak{M}(\mu)$. For if

\[ A = \bigcup A_n, \]

where $\{A_n\}$ is a sequence of disjoint sets of $\mathfrak{M}(\mu)$, we have shown that (38)
holds if $\mu^*(A_n) < +\infty$ for every $n$, and in the other case (38) is trivial.

Finally, we have to show that $\mathfrak{M}(\mu)$ is a $\sigma$-ring. If $A_n \in \mathfrak{M}(\mu)$, $n = 1$,
2, 3, ..., it is clear that $\bigcup A_n \in \mathfrak{M}(\mu)$ (Theorem 2.12). Suppose $A \in \mathfrak{M}(\mu)$,
$B \in \mathfrak{M}(\mu)$, and

\[ A = \bigcup_{n=1}^{\infty} A_n, \qquad B = \bigcup_{n=1}^{\infty} B_n, \]

where $A_n, B_n \in \mathfrak{M}_F(\mu)$. Then the identity

\[ A_n \cap B = \bigcup_{i=1}^{\infty} (A_n \cap B_i) \]

shows that $A_n \cap B \in \mathfrak{M}(\mu)$; and since

\[ \mu^*(A_n \cap B) \le \mu^*(A_n) < +\infty, \]

$A_n \cap B \in \mathfrak{M}_F(\mu)$. Hence $A_n - B \in \mathfrak{M}_F(\mu)$, and $A - B \in \mathfrak{M}(\mu)$ since

\[ A - B = \bigcup_{n=1}^{\infty} (A_n - B). \]

We now replace $\mu^*(A)$ by $\mu(A)$ if $A \in \mathfrak{M}(\mu)$. Thus $\mu$, originally only defined
on $\mathscr{E}$, is extended to a countably additive set function on the $\sigma$-ring
$\mathfrak{M}(\mu)$. This extended set function is called a measure. The special case $\mu = m$
is called the Lebesgue measure on $R^p$.

11.11 Remarks

(a) If $A$ is open, then $A \in \mathfrak{M}(\mu)$. For every open set in $R^p$ is the union
of a countable collection of open intervals. To see this, it is sufficient to
construct a countable base whose members are open intervals.

By taking complements, it follows that every closed set is in $\mathfrak{M}(\mu)$.

(b) If $A \in \mathfrak{M}(\mu)$ and $\varepsilon > 0$, there exist sets $F$ and $G$ such that

\[ F \subset A \subset G, \]

$F$ is closed, $G$ is open, and

\[ \mu(G - A) < \varepsilon, \qquad \mu(A - F) < \varepsilon. \tag{39} \]

The first inequality holds since $\mu^*$ was defined by means of coverings
by open elementary sets. The second inequality then follows by taking
complements.

(c) We say that $E$ is a Borel set if $E$ can be obtained by a countable
number of operations, starting from open sets, each operation consisting
in taking unions, intersections, or complements. The collection $\mathscr{B}$ of all
Borel sets in $R^p$ is a $\sigma$-ring; in fact, it is the smallest $\sigma$-ring which contains
all open sets. By Remark (a), $E \in \mathfrak{M}(\mu)$ if $E \in \mathscr{B}$.

(d) If $A \in \mathfrak{M}(\mu)$, there exist Borel sets $F$ and $G$ such that $F \subset A \subset G$,
and

\[ \mu(G - A) = \mu(A - F) = 0. \tag{40} \]

This follows from (b) if we take $\varepsilon = 1/n$ and let $n \to \infty$.

Since $A = F \cup (A - F)$, we see that every $A \in \mathfrak{M}(\mu)$ is the union of a
Borel set and a set of measure zero.

The Borel sets are $\mu$-measurable for every $\mu$. But the sets of measure
zero [that is, the sets $E$ for which $\mu^*(E) = 0$] may be different for different
$\mu$'s.

(e) For every $\mu$, the sets of measure zero form a $\sigma$-ring.

(f) In case of the Lebesgue measure, every countable set has measure
zero. But there are uncountable (in fact, perfect) sets of measure zero.
The Cantor set may be taken as an example: Using the notation of Sec.
2.44, it is easily seen that

\[ m(E_n) = \left( \tfrac{2}{3} \right)^n \qquad (n = 1, 2, 3, \dots); \]

and since $P = \bigcap E_n$, $P \subset E_n$ for every $n$, so that $m(P) = 0$.
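
The formula $m(E_n) = (2/3)^n$ for the stages of the Cantor construction can be confirmed by generating the intervals explicitly. The sketch below is an added illustration that assumes the usual middle-thirds construction, with $E_0 = [0, 1]$ and each stage obtained by removing the open middle third of every interval; exact rational arithmetic avoids rounding issues. Function names are illustrative.

```python
from fractions import Fraction


def cantor_stage(n):
    """List of the 2^n closed intervals making up the n-th stage E_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_stage = []
        for a, b in intervals:
            third = (b - a) / 3
            next_stage.append((a, a + third))      # left third
            next_stage.append((b - third, b))      # right third
        intervals = next_stage
    return intervals


def total_length(intervals):
    """m of a union of pairwise disjoint intervals, as in (11)."""
    return sum(b - a for a, b in intervals)
```

Since the lengths $(2/3)^n$ tend to 0 and the Cantor set lies in every stage, monotonicity of $m$ forces its measure to be 0, which is exactly the argument in Remark (f).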
MEASURE SPACES

11.12 Definition Suppose $X$ is a set, not necessarily a subset of a euclidean
space, or indeed of any metric space. $X$ is said to be a measure space if there
exists a $\sigma$-ring $\mathfrak{M}$ of subsets of $X$ (which are called measurable sets) and a
nonnegative countably additive set function $\mu$ (which is called a measure), defined
on $\mathfrak{M}$.

If, in addition, $X \in \mathfrak{M}$, then $X$ is said to be a measurable space.

For instance, we can take $X = R^p$, $\mathfrak{M}$ the collection of all Lebesgue-measurable
subsets of $R^p$, and $\mu$ Lebesgue measure.

Or, let $X$ be the set of all positive integers, $\mathfrak{M}$ the collection of all subsets
of $X$, and $\mu(E)$ the number of elements of $E$.

Another example is provided by probability theory, where events may be 
considered as sets, and the probability of the occurrence of events is an additive 
(or countably additive) set function. 

In the following sections we shall always deal with measurable spaces. 
It should be emphasized that the integration theory which we shall soon discuss 
would not become simpler in any respect if we sacrificed the generality we have 
now attained and restricted ourselves to Lebesgue measure, say, on an interval 
of the real line. In fact, the essential features of the theory are brought out 
with much greater clarity in the more general situation, where it is seen that 
everything depends only on the countable additivity of $\mu$ on a $\sigma$-ring.

It will be convenient to introduce the notation

\[ \{x \mid P\} \tag{41} \]

for the set of all elements $x$ which have the property $P$.


MEASURABLE FUNCTIONS 

11.13 Definition Let $f$ be a function defined on the measurable space $X$, with
values in the extended real number system. The function $f$ is said to be measurable
if the set

\[ \{x \mid f(x) > a\} \tag{42} \]

is measurable for every real $a$.


11.14 Example If $X = R^p$ and $\mathfrak{M} = \mathfrak{M}(\mu)$ as defined in Definition 11.9,
every continuous $f$ is measurable, since then (42) is an open set.

11.15 Theorem Each of the following four conditions implies the other three:

(43) $\{x \mid f(x) > a\}$ is measurable for every real $a$.

(44) $\{x \mid f(x) \ge a\}$ is measurable for every real $a$.

(45) $\{x \mid f(x) < a\}$ is measurable for every real $a$.

(46) $\{x \mid f(x) \le a\}$ is measurable for every real $a$.

Proof The relations

\[
\begin{aligned}
\{x \mid f(x) \ge a\} &= \bigcap_{n=1}^{\infty} \left\{ x \,\middle|\, f(x) > a - \frac{1}{n} \right\}, \\
\{x \mid f(x) < a\} &= X - \{x \mid f(x) \ge a\}, \\
\{x \mid f(x) \le a\} &= \bigcap_{n=1}^{\infty} \left\{ x \,\middle|\, f(x) < a + \frac{1}{n} \right\}, \\
\{x \mid f(x) > a\} &= X - \{x \mid f(x) \le a\}
\end{aligned}
\]

show successively that (43) implies (44), (44) implies (45), (45) implies
(46), and (46) implies (43).

Hence any of these conditions may be used instead of (42) to define
measurability.

11.16 Theorem If $f$ is measurable, then $|f|$ is measurable.

Proof

\[ \{x \mid |f(x)| < a\} = \{x \mid f(x) < a\} \cap \{x \mid f(x) > -a\}. \]

11.17 Theorem Let $\{f_n\}$ be a sequence of measurable functions. For $x \in X$, put

\[
\begin{aligned}
g(x) &= \sup f_n(x) \qquad (n = 1, 2, 3, \dots), \\
h(x) &= \limsup_{n \to \infty} f_n(x).
\end{aligned}
\]

Then $g$ and $h$ are measurable.

The same is of course true of the inf and lim inf.

Proof

\[
\begin{aligned}
\{x \mid g(x) > a\} &= \bigcup_{n=1}^{\infty} \{x \mid f_n(x) > a\}, \\
h(x) &= \inf g_m(x),
\end{aligned}
\]

where $g_m(x) = \sup f_n(x)$ $(n \ge m)$.

Corollaries

(a) If $f$ and $g$ are measurable, then $\max(f, g)$ and $\min(f, g)$ are measurable.
If

\[ f^+ = \max(f, 0), \qquad f^- = -\min(f, 0), \tag{47} \]

it follows, in particular, that $f^+$ and $f^-$ are measurable.

(b) The limit of a convergent sequence of measurable functions is measurable.


11.18 Theorem Let $f$ and $g$ be measurable real-valued functions defined on $X$,
let $F$ be real and continuous on $R^2$, and put

\[ h(x) = F(f(x), g(x)) \qquad (x \in X). \]

Then $h$ is measurable.

In particular, $f + g$ and $fg$ are measurable.

Proof Let

\[ G_a = \{(u, v) \mid F(u, v) > a\}. \]

Then $G_a$ is an open subset of $R^2$, and we can write

\[ G_a = \bigcup_{n=1}^{\infty} I_n, \]

where $\{I_n\}$ is a sequence of open intervals:

\[ I_n = \{(u, v) \mid a_n < u < b_n,\ c_n < v < d_n\}. \]

Since

\[ \{x \mid a_n < f(x) < b_n\} = \{x \mid f(x) > a_n\} \cap \{x \mid f(x) < b_n\} \]

is measurable, it follows that the set

\[ \{x \mid (f(x), g(x)) \in I_n\} = \{x \mid a_n < f(x) < b_n\} \cap \{x \mid c_n < g(x) < d_n\} \]

is measurable. Hence the same is true of

\[
\begin{aligned}
\{x \mid h(x) > a\} &= \{x \mid (f(x), g(x)) \in G_a\} \\
&= \bigcup_{n=1}^{\infty} \{x \mid (f(x), g(x)) \in I_n\}.
\end{aligned}
\]

Summing up, we may say that all ordinary operations of analysis, includ- 
ing limit operations, when applied to measurable functions, lead to measurable 
functions; in other words, all functions that are ordinarily met with are measur- 
able. 

That this is, however, only a rough statement is shown by the following
example (based on Lebesgue measure, on the real line): If $h(x) = f(g(x))$, where
$f$ is measurable and $g$ is continuous, then $h$ is not necessarily measurable.
(For the details, we refer to McShane, page 241.)

The reader may have noticed that measure has not been mentioned in
our discussion of measurable functions. In fact, the class of measurable functions
on $X$ depends only on the $\sigma$-ring $\mathfrak{M}$ (using the notation of Definition 11.12).
For instance, we may speak of Borel-measurable functions on $R^p$, that is, of
functions $f$ for which

\[ \{x \mid f(x) > a\} \]

is always a Borel set, without reference to any particular measure.


SIMPLE FUNCTIONS 

11.19 Definition Let $s$ be a real-valued function defined on $X$. If the range
of $s$ is finite, we say that $s$ is a simple function.

Let $E \subset X$, and put

\[
K_E(x) = \begin{cases} 1 & (x \in E), \\ 0 & (x \notin E). \end{cases} \tag{48}
\]

$K_E$ is called the characteristic function of $E$.

Suppose the range of $s$ consists of the distinct numbers $c_1, \dots, c_n$. Let

\[ E_i = \{x \mid s(x) = c_i\} \qquad (i = 1, \dots, n). \]

Then

\[ s = \sum_{i=1}^{n} c_i K_{E_i}, \tag{49} \]

that is, every simple function is a finite linear combination of characteristic
functions. It is clear that $s$ is measurable if and only if the sets $E_1, \dots, E_n$ are
measurable.

It is of interest that every function can be approximated by simple 
functions: 

11.20 Theorem Let $f$ be a real function on $X$. There exists a sequence $\{s_n\}$ of
simple functions such that $s_n(x) \to f(x)$ as $n \to \infty$, for every $x \in X$. If $f$ is measurable,
$\{s_n\}$ may be chosen to be a sequence of measurable functions. If $f \ge 0$, $\{s_n\}$
may be chosen to be a monotonically increasing sequence.

Proof If $f \ge 0$, define

$$E_{ni} = \left\{x \,\middle|\, \frac{i-1}{2^n} \le f(x) < \frac{i}{2^n}\right\}, \qquad F_n = \{x \mid f(x) \ge n\}$$

for $n = 1, 2, 3, \dots$, $i = 1, 2, \dots, n2^n$. Put

$$s_n = \sum_{i=1}^{n2^n} \frac{i-1}{2^n} K_{E_{ni}} + n K_{F_n}. \tag{50}$$

In the general case, let $f = f^+ - f^-$, and apply the preceding construction
to $f^+$ and to $f^-$.

It may be noted that the sequence $\{s_n\}$ given by (50) converges
uniformly to $f$ if $f$ is bounded.
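As a numerical illustration of the construction (50), the following Python sketch evaluates $s_n(x)$ pointwise. The name `simple_approx` and the sample function $f(x) = x^2$ are our own illustrative choices, not part of the text, and the sketch assumes that $f$ is nonnegative and finite-valued.

```python
import math

def simple_approx(f, n):
    """Return the simple function s_n of (50) for a nonnegative function f."""
    def s_n(x):
        y = f(x)
        if y >= n:                     # x lies in F_n, where s_n takes the value n
            return float(n)
        # x lies in E_{ni} with (i-1)/2^n <= f(x) < i/2^n, so s_n(x) = (i-1)/2^n
        return math.floor(y * 2 ** n) / 2 ** n
    return s_n

f = lambda x: x * x                    # illustrative choice, not from the text
xs = [k / 10 for k in range(31)]       # sample points in [0, 3]
for x in xs:
    values = [simple_approx(f, n)(x) for n in range(1, 12)]
    assert all(a <= b for a, b in zip(values, values[1:]))  # s_1 <= s_2 <= ...
    assert all(v <= f(x) for v in values)                   # s_n <= f
    assert f(x) - values[-1] < 2 ** -11                     # error below 2^-n once n > f(x)
```

The last assertion reflects the closing remark of the proof: as soon as $n$ exceeds a bound for $f$, the error $f - s_n$ is less than $2^{-n}$ everywhere, which gives uniform convergence for bounded $f$.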


INTEGRATION 

We shall define integration on a measurable space $X$, in which $\mathfrak{M}$ is the $\sigma$-ring
of measurable sets, and $\mu$ is the measure. The reader who wishes to visualize
a more concrete situation may think of $X$ as the real line, or an interval, and of
$\mu$ as the Lebesgue measure $m$.

11.21 Definition Suppose

$$s(x) = \sum_{i=1}^{n} c_i K_{E_i}(x) \qquad (x \in X,\ c_i > 0) \tag{51}$$

is measurable, and suppose $E \in \mathfrak{M}$. We define

$$I_E(s) = \sum_{i=1}^{n} c_i \mu(E \cap E_i). \tag{52}$$

If $f$ is measurable and nonnegative, we define

$$\int_E f \, d\mu = \sup I_E(s), \tag{53}$$

where the sup is taken over all measurable simple functions $s$ such that $0 \le s \le f$.

The left member of (53) is called the Lebesgue integral of $f$ with respect
to the measure $\mu$, over the set $E$. It should be noted that the integral may have
the value $+\infty$.

It is easily verified that

$$\int_E s \, d\mu = I_E(s) \tag{54}$$

for every nonnegative simple measurable function $s$.
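To make (52) concrete, here is a small Python sketch on a hypothetical finite measure space in which every subset is measurable and $\mu$ is given by point masses. The space, the masses, and the names `mu` and `I` are assumptions made for illustration only. The value $c = 0$ is allowed in the loop, since it contributes nothing to the sum.

```python
from fractions import Fraction

# Hypothetical measure space: X = {0, ..., 5}, mu defined by point masses.
mu_point = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(1, 4),
            3: Fraction(2), 4: Fraction(0), 5: Fraction(3)}

def mu(A):
    """Measure of a subset A of X."""
    return sum((mu_point[x] for x in A), Fraction(0))

def I(E, s):
    """I_E(s) = sum_i c_i * mu(E ∩ E_i), where E_i = {x : s(x) = c_i}, as in (52)."""
    total = Fraction(0)
    for c in {s(x) for x in mu_point}:          # the finitely many values of s
        E_i = {x for x in mu_point if s(x) == c}
        total += c * mu(E & E_i)
    return total

s = lambda x: Fraction(x % 3)                   # a simple function with values 0, 1, 2
E = {1, 2, 3, 5}

# On a space of point masses, I_E(s) agrees with the weighted sum over E.
assert I(E, s) == sum(s(x) * mu_point[x] for x in E)
```

In this setting the supremum in (53) is attained by $s = f$ itself, because every nonnegative function on a finite set is simple.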

11.22 Definition Let $f$ be measurable, and consider the two integrals

$$\int_E f^+ \, d\mu, \qquad \int_E f^- \, d\mu, \tag{55}$$

where $f^+$ and $f^-$ are defined as in (47).

If at least one of the integrals (55) is finite, we define

$$\int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu. \tag{56}$$

If both integrals in (55) are finite, then (56) is finite, and we say that $f$ is
integrable (or summable) on $E$ in the Lebesgue sense, with respect to $\mu$; we write
$f \in \mathcal{L}(\mu)$ on $E$. If $\mu = m$, the usual notation is: $f \in \mathcal{L}$ on $E$.

This terminology may be a little confusing: If (56) is $+\infty$ or $-\infty$, then
the integral of $f$ over $E$ is defined, although $f$ is not integrable in the above
sense of the word; $f$ is integrable on $E$ only if its integral over $E$ is finite.

We shall be mainly interested in integrable functions, although in some 
cases it is desirable to deal with the more general situation. 


11.23 Remarks The following properties are evident: 

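(a) If $f$ is measurable and bounded on $E$, and if $\mu(E) < +\infty$, then
$f \in \mathcal{L}(\mu)$ on $E$.

(b) If $a \le f(x) \le b$ for $x \in E$, and $\mu(E) < +\infty$, then

$$a\mu(E) \le \int_E f \, d\mu \le b\mu(E).$$

(c) If $f$ and $g \in \mathcal{L}(\mu)$ on $E$, and if $f(x) \le g(x)$ for $x \in E$, then

$$\int_E f \, d\mu \le \int_E g \, d\mu.$$

(d) If $f \in \mathcal{L}(\mu)$ on $E$, then $cf \in \mathcal{L}(\mu)$ on $E$, for every finite constant $c$, and

$$\int_E cf \, d\mu = c \int_E f \, d\mu.$$

(e) If $\mu(E) = 0$, and $f$ is measurable, then

$$\int_E f \, d\mu = 0.$$

(f) If $f \in \mathcal{L}(\mu)$ on $E$, $A \in \mathfrak{M}$, and $A \subset E$, then $f \in \mathcal{L}(\mu)$ on $A$.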

11.24 Theorem 

(a) Suppose $f$ is measurable and nonnegative on $X$. For $A \in \mathfrak{M}$, define

$$\phi(A) = \int_A f \, d\mu. \tag{57}$$

Then $\phi$ is countably additive on $\mathfrak{M}$.

(b) The same conclusion holds if $f \in \mathcal{L}(\mu)$ on $X$.

Proof It is clear that (b) follows from (a) if we write $f = f^+ - f^-$ and
apply (a) to $f^+$ and to $f^-$.

To prove (a), we have to show that

$$\phi(A) = \sum_{n=1}^{\infty} \phi(A_n) \tag{58}$$

if $A_n \in \mathfrak{M}$ $(n = 1, 2, 3, \dots)$, $A_i \cap A_j = 0$ for $i \ne j$, and $A = \bigcup_1^{\infty} A_n$.

If $f$ is a characteristic function, then the countable additivity of $\phi$ is
precisely the same as the countable additivity of $\mu$, since

$$\int_A K_E \, d\mu = \mu(A \cap E).$$

If $f$ is simple, then $f$ is of the form (51), and the conclusion again
holds.

In the general case, we have, for every measurable simple function $s$
such that $0 \le s \le f$,

$$\int_A s \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} s \, d\mu \le \sum_{n=1}^{\infty} \phi(A_n).$$

Therefore, by (53),

$$\phi(A) \le \sum_{n=1}^{\infty} \phi(A_n). \tag{59}$$

Now if $\phi(A_n) = +\infty$ for some $n$, (58) is trivial, since $\phi(A) \ge \phi(A_n)$.
Suppose $\phi(A_n) < +\infty$ for every $n$.

Given $\varepsilon > 0$, we can choose a measurable simple function $s$ such that
$0 \le s \le f$, and such that

$$\int_{A_1} s \, d\mu \ge \int_{A_1} f \, d\mu - \varepsilon, \qquad \int_{A_2} s \, d\mu \ge \int_{A_2} f \, d\mu - \varepsilon. \tag{60}$$

Hence

$$\phi(A_1 \cup A_2) \ge \int_{A_1 \cup A_2} s \, d\mu = \int_{A_1} s \, d\mu + \int_{A_2} s \, d\mu \ge \phi(A_1) + \phi(A_2) - 2\varepsilon,$$

so that

$$\phi(A_1 \cup A_2) \ge \phi(A_1) + \phi(A_2).$$

It follows that we have, for every $n$,

$$\phi(A_1 \cup \cdots \cup A_n) \ge \phi(A_1) + \cdots + \phi(A_n). \tag{61}$$

Since $A \supset A_1 \cup \cdots \cup A_n$, (61) implies

$$\phi(A) \ge \sum_{n=1}^{\infty} \phi(A_n), \tag{62}$$

and (58) follows from (59) and (62).

Corollary If $A \in \mathfrak{M}$, $B \subset A$, and $\mu(A - B) = 0$, then

$$\int_A f \, d\mu = \int_B f \, d\mu.$$

Since $A = B \cup (A - B)$, this follows from Remark 11.23(e).

11.25 Remarks The preceding corollary shows that sets of measure zero are 
negligible in integration. 

Let us write $f \sim g$ on $E$ if the set

$$\{x \mid f(x) \ne g(x)\} \cap E$$

has measure zero.

Then $f \sim f$; $f \sim g$ implies $g \sim f$; and $f \sim g$, $g \sim h$ implies $f \sim h$. That is,
the relation $\sim$ is an equivalence relation.

If $f \sim g$ on $E$, we clearly have

$$\int_A f \, d\mu = \int_A g \, d\mu,$$

provided the integrals exist, for every measurable subset A of E. 

If a property $P$ holds for every $x \in E - A$, and if $\mu(A) = 0$, it is customary
to say that P holds for almost all x e E, or that P holds almost everywhere on 
E. (This concept of “almost everywhere” depends of course on the particular 
measure under consideration. In the literature, unless something is said to the 
contrary, it usually refers to Lebesgue measure.) 

If $f \in \mathcal{L}(\mu)$ on $E$, it is clear that $f(x)$ must be finite almost everywhere on $E$.
In most cases we therefore do not lose any generality if we assume the given 
functions to be finite-valued from the outset. 


11.26 Theorem If $f \in \mathcal{L}(\mu)$ on $E$, then $|f| \in \mathcal{L}(\mu)$ on $E$, and

$$\left| \int_E f \, d\mu \right| \le \int_E |f| \, d\mu. \tag{63}$$

Proof Write $E = A \cup B$, where $f(x) \ge 0$ on $A$ and $f(x) < 0$ on $B$.
By Theorem 11.24,

$$\int_E |f| \, d\mu = \int_A |f| \, d\mu + \int_B |f| \, d\mu = \int_A f^+ \, d\mu + \int_B f^- \, d\mu < +\infty,$$

so that $|f| \in \mathcal{L}(\mu)$. Since $f \le |f|$ and $-f \le |f|$, we see that

$$\int_E f \, d\mu \le \int_E |f| \, d\mu, \qquad -\int_E f \, d\mu \le \int_E |f| \, d\mu,$$

and (63) follows.

Since the integrability of $f$ implies that of $|f|$, the Lebesgue integral is
often called an absolutely convergent integral. It is of course possible to define 
nonabsolutely convergent integrals, and in the treatment of some problems it is 
essential to do so. But these integrals lack some of the most useful properties 
of the Lebesgue integral and play a somewhat less important role in analysis. 


11.27 Theorem Suppose $f$ is measurable on $E$, $|f| \le g$, and $g \in \mathcal{L}(\mu)$ on $E$.
Then $f \in \mathcal{L}(\mu)$ on $E$.

Proof We have $f^+ \le g$ and $f^- \le g$.

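11.28 Lebesgue's monotone convergence theorem Suppose $E \in \mathfrak{M}$. Let $\{f_n\}$ be
a sequence of measurable functions such that

$$0 \le f_1(x) \le f_2(x) \le \cdots \qquad (x \in E). \tag{64}$$

Let $f$ be defined by

$$f_n(x) \to f(x) \qquad (x \in E) \tag{65}$$

as $n \to \infty$. Then

$$\int_E f_n \, d\mu \to \int_E f \, d\mu \qquad (n \to \infty). \tag{66}$$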

Proof By (64) it is clear that, as $n \to \infty$,

$$\int_E f_n \, d\mu \to \alpha \tag{67}$$

for some $\alpha$; and since $\int f_n \le \int f$, we have

$$\alpha \le \int_E f \, d\mu. \tag{68}$$

Choose $c$ such that $0 < c < 1$, and let $s$ be a simple measurable
function such that $0 \le s \le f$. Put

$$E_n = \{x \mid f_n(x) \ge c\,s(x)\} \qquad (n = 1, 2, 3, \dots).$$

By (64), $E_1 \subset E_2 \subset E_3 \subset \cdots$; and by (65),

$$E = \bigcup_{n=1}^{\infty} E_n. \tag{69}$$

For every $n$,

$$\int_E f_n \, d\mu \ge \int_{E_n} f_n \, d\mu \ge c \int_{E_n} s \, d\mu. \tag{70}$$

We let $n \to \infty$ in (70). Since the integral is a countably additive set function
(Theorem 11.24), (69) shows that we may apply Theorem 11.3 to the last
integral in (70), and we obtain

$$\alpha \ge c \int_E s \, d\mu. \tag{71}$$

Letting $c \to 1$, we see that

$$\alpha \ge \int_E s \, d\mu,$$

and (53) implies

$$\alpha \ge \int_E f \, d\mu. \tag{72}$$

The theorem follows from (67), (68), and (72).
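A numerical sketch may help to visualize (64) to (66). We take, as an illustrative example that is not in the text, $f(x) = x^{-1/2}$ on $(0, 1]$ and the truncations $f_n = \min(f, n)$, for which a direct computation gives $\int f_n \, dx = 2 - 1/n$, increasing to $\int f \, dx = 2$. The helper `midpoint_integral` is a plain midpoint rule, used here only as an approximation to the integral of a bounded piecewise smooth function.

```python
def midpoint_integral(h, a, b, cells=20000):
    """Midpoint-rule approximation to the integral of h over [a, b]."""
    width = (b - a) / cells
    return sum(h(a + (k + 0.5) * width) for k in range(cells)) * width

f = lambda x: x ** -0.5                 # unbounded near 0, but integrable on (0, 1]

def truncated(n):
    """f_n = min(f, n); these functions increase pointwise to f."""
    return lambda x: min(f(x), n)

ns = (1, 2, 4, 8, 16)
integrals = [midpoint_integral(truncated(n), 0.0, 1.0) for n in ns]

# The integrals increase with n and match the closed form 2 - 1/n.
assert all(a < b for a, b in zip(integrals, integrals[1:]))
assert all(abs(v - (2 - 1 / n)) < 1e-4 for v, n in zip(integrals, ns))
```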

11.29 Theorem Suppose $f = f_1 + f_2$, where $f_i \in \mathcal{L}(\mu)$ on $E$ $(i = 1, 2)$. Then
$f \in \mathcal{L}(\mu)$ on $E$, and

$$\int_E f \, d\mu = \int_E f_1 \, d\mu + \int_E f_2 \, d\mu. \tag{73}$$

Proof First, suppose $f_1 \ge 0$, $f_2 \ge 0$. If $f_1$ and $f_2$ are simple, (73) follows
trivially from (52) and (54). Otherwise, choose monotonically increasing
sequences $\{s_n'\}$, $\{s_n''\}$ of nonnegative measurable simple functions which
converge to $f_1$, $f_2$. Theorem 11.20 shows that this is possible. Put
$s_n = s_n' + s_n''$. Then

$$\int_E s_n \, d\mu = \int_E s_n' \, d\mu + \int_E s_n'' \, d\mu,$$

and (73) follows if we let $n \to \infty$ and appeal to Theorem 11.28.
Next, suppose $f_1 \ge 0$, $f_2 \le 0$. Put

$$A = \{x \mid f(x) \ge 0\}, \qquad B = \{x \mid f(x) < 0\}.$$

Then $f$, $f_1$, and $-f_2$ are nonnegative on $A$. Hence

$$\int_A f_1 \, d\mu = \int_A f \, d\mu + \int_A (-f_2) \, d\mu = \int_A f \, d\mu - \int_A f_2 \, d\mu. \tag{74}$$

Similarly, $-f$, $f_1$, and $-f_2$ are nonnegative on $B$, so that

$$\int_B (-f_2) \, d\mu = \int_B f_1 \, d\mu + \int_B (-f) \, d\mu,$$

or

$$\int_B f_1 \, d\mu = \int_B f \, d\mu - \int_B f_2 \, d\mu, \tag{75}$$

and (73) follows if we add (74) and (75).

In the general case, $E$ can be decomposed into four sets $E_i$, on each
of which $f_1(x)$ and $f_2(x)$ are of constant sign. The two cases we have proved
so far imply

$$\int_{E_i} f \, d\mu = \int_{E_i} f_1 \, d\mu + \int_{E_i} f_2 \, d\mu \qquad (i = 1, 2, 3, 4),$$

and (73) follows by adding these four equations.

We are now in a position to reformulate Theorem 11.28 for series. 

11.30 Theorem Suppose $E \in \mathfrak{M}$. If $\{f_n\}$ is a sequence of nonnegative measurable
functions and

$$f(x) = \sum_{n=1}^{\infty} f_n(x) \qquad (x \in E), \tag{76}$$

then

$$\int_E f \, d\mu = \sum_{n=1}^{\infty} \int_E f_n \, d\mu.$$

Proof The partial sums of (76) form a monotonically increasing sequence. 

11.31 Fatou's theorem Suppose $E \in \mathfrak{M}$. If $\{f_n\}$ is a sequence of nonnegative
measurable functions and

$$f(x) = \liminf_{n \to \infty} f_n(x) \qquad (x \in E),$$

then

$$\int_E f \, d\mu \le \liminf_{n \to \infty} \int_E f_n \, d\mu. \tag{77}$$

Strict inequality may hold in (77). An example is given in Exercise 5.

Proof For $n = 1, 2, 3, \dots$ and $x \in E$, put

$$g_n(x) = \inf f_i(x) \qquad (i \ge n).$$

Then $g_n$ is measurable on $E$, and

$$0 \le g_1(x) \le g_2(x) \le \cdots, \tag{78}$$

$$g_n(x) \le f_n(x), \tag{79}$$

$$g_n(x) \to f(x) \qquad (n \to \infty). \tag{80}$$

By (78), (80), and Theorem 11.28,

$$\int_E g_n \, d\mu \to \int_E f \, d\mu, \tag{81}$$

so that (77) follows from (79) and (81).
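The strict inequality in (77) can be checked by hand on the sequence of Exercise 5. The Python sketch below does this with exact rational arithmetic; the helper names are ours, and the integral is computed under the assumption (true for this sequence) that each $f_n$ is constant on the open intervals $(0, \tfrac12)$ and $(\tfrac12, 1)$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def g(x):
    return 0 if x <= HALF else 1

def f(n, x):
    # f_{2k}(x) = g(x) and f_{2k+1}(x) = g(1 - x), as in Exercise 5
    return g(x) if n % 2 == 0 else g(1 - x)

def integral(n):
    # f_n is constant on (0, 1/2) and on (1/2, 1), so its integral over [0, 1]
    # is the value at an interior point of each half, times the length 1/2.
    return f(n, Fraction(1, 4)) * HALF + f(n, Fraction(3, 4)) * HALF

def liminf(x):
    # The sequence f_n(x) has period 2 in n, so its lim inf is min(f_0(x), f_1(x)).
    return min(f(0, x), f(1, x))

grid = [Fraction(k, 8) for k in range(9)]
assert all(liminf(x) == 0 for x in grid)             # the lim inf vanishes
assert all(integral(n) == HALF for n in range(10))   # but every integral is 1/2
```

Thus the left side of (77) is $0$ while the right side is $\tfrac12$.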

11.32 Lebesgue's dominated convergence theorem Suppose $E \in \mathfrak{M}$. Let $\{f_n\}$ be
a sequence of measurable functions such that

$$f_n(x) \to f(x) \qquad (x \in E) \tag{82}$$

as $n \to \infty$. If there exists a function $g \in \mathcal{L}(\mu)$ on $E$, such that

$$|f_n(x)| \le g(x) \qquad (n = 1, 2, 3, \dots,\ x \in E), \tag{83}$$

then

$$\lim_{n \to \infty} \int_E f_n \, d\mu = \int_E f \, d\mu. \tag{84}$$

Because of (83), $\{f_n\}$ is said to be dominated by $g$, and we talk about
dominated convergence. By Remark 11.25, the conclusion is the same if (82)
holds almost everywhere on $E$.

Proof First, (83) and Theorem 11.27 imply that $f_n \in \mathcal{L}(\mu)$ and $f \in \mathcal{L}(\mu)$
on $E$.

Since $f_n + g \ge 0$, Fatou's theorem shows that

$$\int_E (f + g) \, d\mu \le \liminf_{n \to \infty} \int_E (f_n + g) \, d\mu,$$

or

$$\int_E f \, d\mu \le \liminf_{n \to \infty} \int_E f_n \, d\mu. \tag{85}$$

Since $g - f_n \ge 0$, we see similarly that

$$\int_E (g - f) \, d\mu \le \liminf_{n \to \infty} \int_E (g - f_n) \, d\mu,$$

so that

$$-\int_E f \, d\mu \le \liminf_{n \to \infty} \left[ -\int_E f_n \, d\mu \right],$$

which is the same as

$$\int_E f \, d\mu \ge \limsup_{n \to \infty} \int_E f_n \, d\mu. \tag{86}$$

The existence of the limit in (84) and the equality asserted by (84)
now follow from (85) and (86).

Corollary If $\mu(E) < +\infty$, $\{f_n\}$ is uniformly bounded on $E$, and $f_n(x) \to f(x)$ on $E$,
then (84) holds.

A uniformly bounded convergent sequence is often said to be boundedly 
convergent. 
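The role of the dominating function can be seen by comparing two sequences on $[0, 1]$; both are illustrative choices of ours, not examples from the text. The sequence $f_n(x) = x^n$ is bounded by $1$ and tends to $0$ for $0 \le x < 1$, so the corollary applies. The sequence $f_n = n K_{(0, 1/n)}$ also tends to $0$ at every point, but any $g$ satisfying (83) would have to be at least $1/x - 1$ on $(0, 1)$, which is not integrable. The sketch below records the integrals in closed form.

```python
from fractions import Fraction

def integral_of_power(n):
    # Exact integral of x**n over [0, 1]; here |f_n| <= 1 for every n.
    return Fraction(1, n + 1)

def integral_of_spike(n):
    # f_n equals n on (0, 1/n) and 0 elsewhere: height times length of the interval.
    return n * Fraction(1, n)

ns = (1, 10, 100, 1000)
bounded = [integral_of_power(n) for n in ns]   # tends to 0, the integral of the limit
spikes = [integral_of_spike(n) for n in ns]    # stays at 1, although the limit is 0

assert all(a > b for a, b in zip(bounded, bounded[1:]))
assert all(v == 1 for v in spikes)
```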


COMPARISON WITH THE RIEMANN INTEGRAL 

Our next theorem will show that every function which is Riemann-integrable 
on an interval is also Lebesgue-integrable, and that Riemann-integrable func- 
tions are subject to rather stringent continuity conditions. Quite apart from the 
fact that the Lebesgue theory therefore enables us to integrate a much larger 
class of functions, its greatest advantage lies perhaps in the ease with which 
many limit operations can be handled; from this point of view, Lebesgue’s 
convergence theorems may well be regarded as the core of the Lebesgue theory. 

One of the difficulties which is encountered in the Riemann theory is 
that limits of Riemann-integrable functions (or even continuous functions) 
may fail to be Riemann-integrable. This difficulty is now almost eliminated, 
since limits of measurable functions are always measurable. 

Let the measure space $X$ be the interval $[a, b]$ of the real line, with $\mu = m$
(the Lebesgue measure), and $\mathfrak{M}$ the family of Lebesgue-measurable subsets
of $[a, b]$. Instead of

$$\int_X f \, dm$$

it is customary to use the familiar notation

$$\int_a^b f \, dx$$

for the Lebesgue integral of $f$ over $[a, b]$. To distinguish Riemann integrals
from Lebesgue integrals, we shall now denote the former by

$$\mathcal{R}\int_a^b f \, dx.$$


11.33 Theorem 

(a) If $f \in \mathcal{R}$ on $[a, b]$, then $f \in \mathcal{L}$ on $[a, b]$, and

$$\int_a^b f \, dx = \mathcal{R}\int_a^b f \, dx. \tag{87}$$

(b) Suppose $f$ is bounded on $[a, b]$. Then $f \in \mathcal{R}$ on $[a, b]$ if and only if $f$ is
continuous almost everywhere on $[a, b]$.

Proof Suppose $f$ is bounded. By Definition 6.1 and Theorem 6.4 there
is a sequence $\{P_k\}$ of partitions of $[a, b]$, such that $P_{k+1}$ is a refinement
of $P_k$, such that the distance between adjacent points of $P_k$ is less than
$1/k$, and such that

$$\lim_{k \to \infty} L(P_k, f) = \mathcal{R}\underline{\int} f \, dx, \qquad \lim_{k \to \infty} U(P_k, f) = \mathcal{R}\overline{\int} f \, dx. \tag{88}$$

(In this proof, all integrals are taken over $[a, b]$.)

If $P_k = \{x_0, x_1, \dots, x_n\}$, with $x_0 = a$, $x_n = b$, define

$$U_k(a) = L_k(a) = f(a);$$

put $U_k(x) = M_i$ and $L_k(x) = m_i$ for $x_{i-1} < x \le x_i$, $1 \le i \le n$, using the
notation introduced in Definition 6.1. Then

$$L(P_k, f) = \int L_k \, dx, \qquad U(P_k, f) = \int U_k \, dx, \tag{89}$$

and

$$L_1(x) \le L_2(x) \le \cdots \le f(x) \le \cdots \le U_2(x) \le U_1(x) \tag{90}$$

for all $x \in [a, b]$, since $P_{k+1}$ refines $P_k$. By (90), there exist

$$L(x) = \lim_{k \to \infty} L_k(x), \qquad U(x) = \lim_{k \to \infty} U_k(x). \tag{91}$$

Observe that $L$ and $U$ are bounded measurable functions on $[a, b]$,
that

$$L(x) \le f(x) \le U(x) \qquad (a \le x \le b), \tag{92}$$
and that

$$\int L \, dx = \mathcal{R}\underline{\int} f \, dx, \qquad \int U \, dx = \mathcal{R}\overline{\int} f \, dx, \tag{93}$$

by (88), (90), and the monotone convergence theorem.

So far, nothing has been assumed about $f$ except that $f$ is a bounded
real function on $[a, b]$.

To complete the proof, note that $f \in \mathcal{R}$ if and only if its upper and
lower Riemann integrals are equal, hence if and only if

$$\int L \, dx = \int U \, dx; \tag{94}$$

since $L \le U$, (94) happens if and only if $L(x) = U(x)$ for almost all
$x \in [a, b]$ (Exercise 1).

In that case, (92) implies that

$$L(x) = f(x) = U(x) \tag{95}$$

almost everywhere on $[a, b]$, so that $f$ is measurable, and (87) follows
from (93) and (95).

Furthermore, if $x$ belongs to no $P_k$, it is quite easy to see that $U(x) = L(x)$
if and only if $f$ is continuous at $x$. Since the union of the sets $P_k$ is countable,
its measure is 0, and we conclude that $f$ is continuous almost everywhere
on $[a, b]$ if and only if $L(x) = U(x)$ almost everywhere, hence
(as we saw above) if and only if $f \in \mathcal{R}$.

This completes the proof. 
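The step functions $L_k$ and $U_k$ of the proof can be followed numerically. The sketch below computes $L(P_k, f)$ and $U(P_k, f)$ on the dyadic partitions of $[a, b]$ into $2^k$ cells, which refine one another as the proof requires. It assumes, for simplicity, that $f$ is nondecreasing, so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$; the function name and the example $f(x) = x^2$ on $[0, 1]$ are our own choices.

```python
from fractions import Fraction

def lower_upper(f, a, b, k):
    """L(P_k, f) and U(P_k, f) for a nondecreasing f on the dyadic partition with 2**k cells."""
    n = 2 ** k
    pts = [a + (b - a) * Fraction(i, n) for i in range(n + 1)]
    width = (b - a) * Fraction(1, n)
    lower = sum(f(pts[i - 1]) for i in range(1, n + 1)) * width   # m_i = f(x_{i-1})
    upper = sum(f(pts[i]) for i in range(1, n + 1)) * width       # M_i = f(x_i)
    return lower, upper

square = lambda x: x * x               # illustrative choice, nondecreasing on [0, 1]
sums = [lower_upper(square, 0, 1, k) for k in range(1, 9)]

for (lo, up), (lo_next, up_next) in zip(sums, sums[1:]):
    assert lo <= lo_next <= Fraction(1, 3) <= up_next <= up       # compare (90)
# For nondecreasing f, U - L = (f(b) - f(a)) / 2**k, which tends to 0.
assert all(up - lo == Fraction(1, 2 ** k) for k, (lo, up) in enumerate(sums, start=1))
```

Here the common limit of the lower and upper sums is $\tfrac13$, the value of both the Riemann and the Lebesgue integral.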

The familiar connection between integration and differentiation is to a 
large degree carried over into the Lebesgue theory. If $f \in \mathcal{L}$ on $[a, b]$, and

$$F(x) = \int_a^x f \, dt \qquad (a \le x \le b), \tag{96}$$

then $F'(x) = f(x)$ almost everywhere on $[a, b]$.

Conversely, if $F$ is differentiable at every point of $[a, b]$ ("almost everywhere"
is not good enough here!) and if $F' \in \mathcal{L}$ on $[a, b]$, then

$$F(x) - F(a) = \int_a^x F'(t) \, dt \qquad (a \le x \le b).$$

For the proofs of these two theorems, we refer the reader to any of the 
works on integration cited in the Bibliography. 


INTEGRATION OF COMPLEX FUNCTIONS

Suppose $f$ is a complex-valued function defined on a measure space $X$, and
$f = u + iv$, where $u$ and $v$ are real. We say that $f$ is measurable if and only if
both $u$ and $v$ are measurable.

It is easy to verify that sums and products of complex measurable functions
are again measurable. Since

$$|f| = (u^2 + v^2)^{1/2},$$

Theorem 11.18 shows that $|f|$ is measurable for every complex measurable $f$.

Suppose $\mu$ is a measure on $X$, $E$ is a measurable subset of $X$, and $f$ is a
complex function on $X$. We say that $f \in \mathcal{L}(\mu)$ on $E$ provided that $f$ is measurable
and

$$\int_E |f| \, d\mu < +\infty, \tag{97}$$

and we define

$$\int_E f \, d\mu = \int_E u \, d\mu + i \int_E v \, d\mu$$

if (97) holds. Since $|u| \le |f|$, $|v| \le |f|$, and $|f| \le |u| + |v|$, it is clear that
(97) holds if and only if $u \in \mathcal{L}(\mu)$ and $v \in \mathcal{L}(\mu)$ on $E$.

Theorems 11.23(a), (d), (e), (f), 11.24(b), 11.26, 11.27, 11.29, and 11.32
can now be extended to Lebesgue integrals of complex functions. The proofs
are quite straightforward. That of Theorem 11.26 is the only one that offers
anything of interest:

If $f \in \mathcal{L}(\mu)$ on $E$, there is a complex number $c$, $|c| = 1$, such that

$$c \int_E f \, d\mu \ge 0.$$

Put $g = cf = u + iv$, $u$ and $v$ real. Then

$$\left| \int_E f \, d\mu \right| = c \int_E f \, d\mu = \int_E g \, d\mu = \int_E u \, d\mu \le \int_E |f| \, d\mu.$$

The third of the above equalities holds since the preceding ones show that
$\int g \, d\mu$ is real.


FUNCTIONS OF CLASS $\mathcal{L}^2$

As an application of the Lebesgue theory, we shall now extend the Parseval 
theorem (which we proved only for Riemann-integrable functions in Chap. 8) 
and prove the Riesz-Fischer theorem for orthonormal sets of functions. 

11.34 Definition Let $X$ be a measurable space. We say that a complex
function $f \in \mathcal{L}^2(\mu)$ on $X$ if $f$ is measurable and if

$$\int_X |f|^2 \, d\mu < +\infty.$$

If $\mu$ is Lebesgue measure, we say $f \in \mathcal{L}^2$. For $f \in \mathcal{L}^2(\mu)$ (we shall omit the
phrase "on $X$" from now on) we define

$$\|f\| = \left\{ \int_X |f|^2 \, d\mu \right\}^{1/2}$$

and call $\|f\|$ the $\mathcal{L}^2(\mu)$ norm of $f$.

11.35 Theorem Suppose $f \in \mathcal{L}^2(\mu)$ and $g \in \mathcal{L}^2(\mu)$. Then $fg \in \mathcal{L}(\mu)$, and

$$\int_X |fg| \, d\mu \le \|f\| \, \|g\|. \tag{98}$$

This is the Schwarz inequality, which we have already encountered for
series and for Riemann integrals. It follows from the inequality

$$0 \le \int_X (|f| + \lambda |g|)^2 \, d\mu = \|f\|^2 + 2\lambda \int_X |fg| \, d\mu + \lambda^2 \|g\|^2,$$

which holds for every real $\lambda$.
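The step from the quadratic inequality to (98) is the usual discriminant argument: a real quadratic in $\lambda$ that is never negative has discriminant at most $0$, that is, $\left(2\int_X |fg| \, d\mu\right)^2 \le 4\|f\|^2\|g\|^2$. The Python sketch below checks this on a hypothetical finite measure space with positive point masses `weights`; the data are random and purely illustrative.

```python
import math
import random

def norm(f, weights):
    """L^2 norm of f with respect to the discrete measure given by weights."""
    return math.sqrt(sum(abs(v) ** 2 * w for v, w in zip(f, weights)))

def integral_abs_product(f, g, weights):
    """Integral of |f g| with respect to the same discrete measure."""
    return sum(abs(a * b) * w for a, b, w in zip(f, g, weights))

random.seed(0)
size = 50
weights = [random.uniform(0.1, 2.0) for _ in range(size)]
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(size)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(size)]

lhs = integral_abs_product(f, g, weights)
rhs = norm(f, weights) * norm(g, weights)
assert lhs <= rhs + 1e-12                       # the Schwarz inequality (98)

# The quadratic in lambda is nonnegative, so its discriminant is at most 0.
discriminant = (2 * lhs) ** 2 - 4 * norm(f, weights) ** 2 * norm(g, weights) ** 2
assert discriminant <= 1e-9
```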

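11.36 Theorem If $f \in \mathcal{L}^2(\mu)$ and $g \in \mathcal{L}^2(\mu)$, then $f + g \in \mathcal{L}^2(\mu)$, and

$$\|f + g\| \le \|f\| + \|g\|.$$

Proof The Schwarz inequality shows that

$$\begin{aligned}
\|f + g\|^2 &= \int |f|^2 + \int f\bar{g} + \int \bar{f}g + \int |g|^2 \\
&\le \|f\|^2 + 2\|f\| \, \|g\| + \|g\|^2 \\
&= (\|f\| + \|g\|)^2.
\end{aligned}$$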

11.37 Remark If we define the distance between two functions $f$ and $g$ in
$\mathcal{L}^2(\mu)$ to be $\|f - g\|$, we see that the conditions of Definition 2.15 are satisfied,
except for the fact that $\|f - g\| = 0$ does not imply that $f(x) = g(x)$ for all $x$,
but only for almost all $x$. Thus, if we identify functions which differ only on a
set of measure zero, $\mathcal{L}^2(\mu)$ is a metric space.

We now consider $\mathcal{L}^2$ on an interval of the real line, with respect to
Lebesgue measure.

11.38 Theorem The continuous functions form a dense subset of $\mathcal{L}^2$ on $[a, b]$.

More explicitly, this means that for any $f \in \mathcal{L}^2$ on $[a, b]$, and any $\varepsilon > 0$,
there is a function $g$, continuous on $[a, b]$, such that

$$\|f - g\| = \left\{ \int_a^b |f - g|^2 \, dx \right\}^{1/2} < \varepsilon.$$

Proof We shall say that $f$ is approximated in $\mathcal{L}^2$ by a sequence $\{g_n\}$ if
$\|f - g_n\| \to 0$ as $n \to \infty$.

Let $A$ be a closed subset of $[a, b]$, and $K_A$ its characteristic function.
Put

$$t(x) = \inf |x - y| \qquad (y \in A)$$

and

$$g_n(x) = \frac{1}{1 + n\,t(x)} \qquad (n = 1, 2, 3, \dots).$$

Then $g_n$ is continuous on $[a, b]$, $g_n(x) = 1$ on $A$, and $g_n(x) \to 0$ on $B$,
where $B = [a, b] - A$. Hence

$$\|g_n - K_A\| = \left\{ \int_B g_n^2 \, dx \right\}^{1/2} \to 0$$

by Theorem 11.32. Thus characteristic functions of closed sets can be
approximated in $\mathcal{L}^2$ by continuous functions.

By (39) the same is true for the characteristic function of any
measurable set, and hence also for simple measurable functions.

If $f \ge 0$ and $f \in \mathcal{L}^2$, let $\{s_n\}$ be a monotonically increasing sequence
of simple nonnegative measurable functions such that $s_n(x) \to f(x)$.
Since $|f - s_n|^2 \le f^2$, Theorem 11.32 shows that $\|f - s_n\| \to 0$.

The general case follows. 
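The first step of the proof can be made explicit in a simple case. Take, as an illustration of ours, $[a, b] = [0, 1]$ and $A = [\tfrac14, \tfrac34]$. Then $t(x)$ is the distance from $x$ to $A$, and a direct integration gives $\|g_n - K_A\|^2 = 2/(4 + n)$. The Python sketch below compares this closed form with a midpoint-rule approximation; the function names are hypothetical.

```python
def dist_to_interval(x, lo, hi):
    # t(x) = inf |x - y| over y in A = [lo, hi]
    return max(lo - x, 0.0, x - hi)

def g(n, x, lo=0.25, hi=0.75):
    # g_n(x) = 1 / (1 + n t(x)): continuous, equal to 1 on A
    return 1.0 / (1.0 + n * dist_to_interval(x, lo, hi))

def l2_distance_to_indicator(n, cells=20000):
    """Midpoint-rule approximation to ||g_n - K_A|| on [0, 1]."""
    width = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * width
        indicator = 1.0 if 0.25 <= x <= 0.75 else 0.0
        total += (g(n, x) - indicator) ** 2 * width
    return total ** 0.5

for n in (4, 16, 96):
    exact = (2.0 / (4 + n)) ** 0.5       # closed form for this particular A
    assert abs(l2_distance_to_indicator(n) - exact) < 1e-4
```

The distances decrease to $0$ like $n^{-1/2}$, although $g_n$ does not converge to $K_A$ uniformly.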


11.39 Definition We say that a sequence of complex functions $\{\phi_n\}$ is an
orthonormal set of functions on a measurable space $X$ if

$$\int_X \phi_n \bar{\phi}_m \, d\mu = \begin{cases} 0 & (n \ne m), \\ 1 & (n = m). \end{cases}$$

In particular, we must have $\phi_n \in \mathcal{L}^2(\mu)$. If $f \in \mathcal{L}^2(\mu)$ and if

$$c_n = \int_X f \bar{\phi}_n \, d\mu \qquad (n = 1, 2, 3, \dots),$$

we write

$$f \sim \sum_{n=1}^{\infty} c_n \phi_n,$$

as in Definition 8.10.

The definition of a trigonometric Fourier series is extended in the same
way to $\mathcal{L}^2$ (or even to $\mathcal{L}$) on $[-\pi, \pi]$. Theorems 8.11 and 8.12 (the Bessel
inequality) hold for any $f \in \mathcal{L}^2(\mu)$. The proofs are the same, word for word.
We can now prove the Parseval theorem. 


11.40 Theorem Suppose

$$f(x) \sim \sum_{-\infty}^{\infty} c_n e^{inx}, \tag{99}$$

where $f \in \mathcal{L}^2$ on $[-\pi, \pi]$. Let $s_n$ be the $n$th partial sum of (99). Then

$$\lim_{n \to \infty} \|f - s_n\| = 0, \tag{100}$$

$$\sum_{-\infty}^{\infty} |c_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f|^2 \, dx. \tag{101}$$

Proof Let $\varepsilon > 0$ be given. By Theorem 11.38, there is a continuous
function $g$ such that

$$\|f - g\| < \frac{\varepsilon}{2}.$$

Moreover, it is easy to see that we can arrange it so that $g(\pi) = g(-\pi)$.
Then $g$ can be extended to a periodic continuous function. By Theorem
8.16, there is a trigonometric polynomial $T$, of degree $N$, say, such that

$$\|g - T\| < \frac{\varepsilon}{2}.$$

Hence, by Theorem 8.11 (extended to $\mathcal{L}^2$), $n \ge N$ implies

$$\|s_n - f\| \le \|T - f\| < \varepsilon,$$

and (100) follows. Equation (101) is deduced from (100) as in the proof of
Theorem 8.16.
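Equation (101) can be tested numerically on a concrete function. For $f(x) = x$ on $[-\pi, \pi]$ (an illustrative choice, not from the text) one finds $c_0 = 0$ and $c_n = i(-1)^n/n$ for $n \ne 0$, so that (101) reads $2\sum_{n \ge 1} n^{-2} = \pi^2/3$. The sketch below approximates the coefficients by a midpoint rule and checks that the partial sums of $\sum |c_n|^2$ stay below $\frac{1}{2\pi}\int |f|^2 \, dx$, with a gap equal (up to quadrature error) to the tail $2\sum_{n > N} n^{-2}$.

```python
import cmath
import math

def fourier_coefficient(f, n, cells=4000):
    """Midpoint-rule approximation to c_n = (1 / 2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    width = 2 * math.pi / cells
    total = 0j
    for k in range(cells):
        x = -math.pi + (k + 0.5) * width
        total += f(x) * cmath.exp(-1j * n * x)
    return total * width / (2 * math.pi)

f = lambda x: x                          # illustrative choice, not from the text
N = 20
coefficient_energy = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-N, N + 1))
function_energy = math.pi ** 2 / 3       # (1 / 2pi) * integral of x^2 over [-pi, pi], in closed form
gap = function_energy - coefficient_energy

# Bessel: the partial sums never exceed the right side of (101);
# the gap is the tail 2 * sum_{n > N} 1/n^2, which lies between 2/(N+1) and 2/N.
assert 2 / (N + 1) - 1e-3 < gap < 2 / N + 1e-3
```

The gap is also the value of $\|f - s_N\|^2$ in the normalized norm, so its decay to $0$ as $N$ grows is exactly the statement (100).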


Corollary If $f \in \mathcal{L}^2$ on $[-\pi, \pi]$, and if

$$\int_{-\pi}^{\pi} f(x) e^{-inx} \, dx = 0 \qquad (n = 0, \pm 1, \pm 2, \dots),$$

then $\|f\| = 0$.

Thus if two functions in $\mathcal{L}^2$ have the same Fourier series, they differ at
most on a set of measure zero.

11.41 Definition Let $f$ and $f_n \in \mathcal{L}^2(\mu)$ $(n = 1, 2, 3, \dots)$. We say that $\{f_n\}$
converges to $f$ in $\mathcal{L}^2(\mu)$ if $\|f_n - f\| \to 0$. We say that $\{f_n\}$ is a Cauchy sequence
in $\mathcal{L}^2(\mu)$ if for every $\varepsilon > 0$ there is an integer $N$ such that $n \ge N$, $m \ge N$ implies

$$\|f_n - f_m\| \le \varepsilon.$$

11.42 Theorem If $\{f_n\}$ is a Cauchy sequence in $\mathcal{L}^2(\mu)$, then there exists a
function $f \in \mathcal{L}^2(\mu)$ such that $\{f_n\}$ converges to $f$ in $\mathcal{L}^2(\mu)$.

This says, in other words, that $\mathcal{L}^2(\mu)$ is a complete metric space.

Proof Since $\{f_n\}$ is a Cauchy sequence, we can find a sequence $\{n_k\}$,
$k = 1, 2, 3, \dots$, such that

$$\|f_{n_k} - f_{n_{k+1}}\| < \frac{1}{2^k} \qquad (k = 1, 2, 3, \dots).$$

Choose a function $g \in \mathcal{L}^2(\mu)$. By the Schwarz inequality,

$$\int_X |g(f_{n_k} - f_{n_{k+1}})| \, d\mu \le \frac{\|g\|}{2^k}.$$

Hence

$$\sum_{k=1}^{\infty} \int_X |g(f_{n_k} - f_{n_{k+1}})| \, d\mu \le \|g\|. \tag{102}$$

By Theorem 11.30, we may interchange the summation and integration in
(102). It follows that

$$|g(x)| \sum_{k=1}^{\infty} |f_{n_k}(x) - f_{n_{k+1}}(x)| < +\infty \tag{103}$$

almost everywhere on $X$. Therefore

$$\sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)| < +\infty \tag{104}$$

almost everywhere on $X$. For if the series in (104) were divergent on a
set $E$ of positive measure, we could take $g(x)$ to be nonzero on a subset of
$E$ of positive measure, thus obtaining a contradiction to (103).

Since the $k$th partial sum of the series

$$\sum_{k=1}^{\infty} (f_{n_{k+1}}(x) - f_{n_k}(x)),$$

which converges almost everywhere on $X$, is

$$f_{n_{k+1}}(x) - f_{n_1}(x),$$
we see that the equation

$$f(x) = \lim_{k \to \infty} f_{n_k}(x)$$

defines $f(x)$ for almost all $x \in X$, and it does not matter how we define
$f(x)$ at the remaining points of $X$.

We shall now show that this function $f$ has the desired properties.
Let $\varepsilon > 0$ be given, and choose $N$ as indicated in Definition 11.41. If
$n_k > N$, Fatou's theorem shows that

$$\|f - f_{n_k}\| \le \liminf_{i \to \infty} \|f_{n_i} - f_{n_k}\| \le \varepsilon.$$

Thus $f - f_{n_k} \in \mathcal{L}^2(\mu)$, and since $f = (f - f_{n_k}) + f_{n_k}$, we see that $f \in \mathcal{L}^2(\mu)$.
Also, since $\varepsilon$ is arbitrary,

$$\lim_{k \to \infty} \|f - f_{n_k}\| = 0.$$

Finally, the inequality

$$\|f - f_n\| \le \|f - f_{n_k}\| + \|f_{n_k} - f_n\| \tag{105}$$

shows that $\{f_n\}$ converges to $f$ in $\mathcal{L}^2(\mu)$; for if we take $n$ and $n_k$ large
enough, each of the two terms on the right of (105) can be made arbitrarily
small.


11.43 The Riesz-Fischer theorem Let $\{\phi_n\}$ be orthonormal on $X$. Suppose
$\sum |c_n|^2$ converges, and put $s_n = c_1\phi_1 + \cdots + c_n\phi_n$. Then there exists a function
$f \in \mathcal{L}^2(\mu)$ such that $\{s_n\}$ converges to $f$ in $\mathcal{L}^2(\mu)$, and such that

$$f \sim \sum_{n=1}^{\infty} c_n \phi_n.$$

Proof For $n > m$,

$$\|s_n - s_m\|^2 = |c_{m+1}|^2 + \cdots + |c_n|^2,$$

so that $\{s_n\}$ is a Cauchy sequence in $\mathcal{L}^2(\mu)$. By Theorem 11.42, there is
a function $f \in \mathcal{L}^2(\mu)$ such that

$$\lim_{n \to \infty} \|f - s_n\| = 0.$$

Now, for $n > k$,

$$\int_X f \bar{\phi}_k \, d\mu - c_k = \int_X f \bar{\phi}_k \, d\mu - \int_X s_n \bar{\phi}_k \, d\mu,$$

so that

$$\left| \int_X f \bar{\phi}_k \, d\mu - c_k \right| \le \|f - s_n\| \cdot \|\phi_k\| + \|f - s_n\|.$$

Letting $n \to \infty$, we see that

$$c_k = \int_X f \bar{\phi}_k \, d\mu \qquad (k = 1, 2, 3, \dots),$$

and the proof is complete.

11.44 Definition An orthonormal set $\{\phi_n\}$ is said to be complete if, for
$f \in \mathcal{L}^2(\mu)$, the equations

$$\int_X f \bar{\phi}_n \, d\mu = 0 \qquad (n = 1, 2, 3, \dots)$$

imply that $\|f\| = 0$.

In the Corollary to Theorem 11.40 we deduced the completeness of the 
trigonometric system from the Parseval equation (101). Conversely, the Parseval 
equation holds for every complete orthonormal set: 

11.45 Theorem Let $\{\phi_n\}$ be a complete orthonormal set. If $f \in \mathcal{L}^2(\mu)$ and if

$$f \sim \sum_{n=1}^{\infty} c_n \phi_n, \tag{106}$$

then

$$\int_X |f|^2 \, d\mu = \sum_{n=1}^{\infty} |c_n|^2. \tag{107}$$

Proof By the Bessel inequality, $\sum |c_n|^2$ converges. Putting

$$s_n = c_1\phi_1 + \cdots + c_n\phi_n,$$

the Riesz-Fischer theorem shows that there is a function $g \in \mathcal{L}^2(\mu)$ such
that

$$g \sim \sum_{n=1}^{\infty} c_n \phi_n, \tag{108}$$

and such that $\|g - s_n\| \to 0$. Hence $\|s_n\| \to \|g\|$. Since

$$\|s_n\|^2 = |c_1|^2 + \cdots + |c_n|^2,$$

we have

$$\int_X |g|^2 \, d\mu = \sum_{n=1}^{\infty} |c_n|^2. \tag{109}$$

Now (106), (108), and the completeness of $\{\phi_n\}$ show that $\|f - g\| = 0$,
so that (109) implies (107).

Combining Theorems 11.43 and 11.45, we arrive at the very interesting
conclusion that every complete orthonormal set induces a 1-1 correspondence
between the functions $f \in \mathcal{L}^2(\mu)$ (identifying those which are equal almost
everywhere) on the one hand and the sequences $\{c_n\}$ for which $\sum |c_n|^2$ converges,
on the other. The representation

$$f \sim \sum_{n=1}^{\infty} c_n \phi_n,$$

together with the Parseval equation, shows that $\mathcal{L}^2(\mu)$ may be regarded as an
infinite-dimensional euclidean space (the so-called "Hilbert space"), in which
the point $f$ has coordinates $c_n$, and the functions $\phi_n$ are the coordinate vectors.


EXERCISES 


1. If $f \ge 0$ and $\int_E f \, d\mu = 0$, prove that $f(x) = 0$ almost everywhere on $E$. Hint: Let $E_n$
be the subset of $E$ on which $f(x) > 1/n$. Write $A = \bigcup E_n$. Then $\mu(A) = 0$ if and only
if $\mu(E_n) = 0$ for every $n$.

2. If $\int_A f \, d\mu = 0$ for every measurable subset $A$ of a measurable set $E$, then $f(x) = 0$
almost everywhere on $E$.

3. If $\{f_n\}$ is a sequence of measurable functions, prove that the set of points $x$ at
which $\{f_n(x)\}$ converges is measurable.

4. If $f \in \mathcal{L}(\mu)$ on $E$ and $g$ is bounded and measurable on $E$, then $fg \in \mathcal{L}(\mu)$ on $E$.

5. Put

$$g(x) = \begin{cases} 0 & (0 \le x \le \tfrac12), \\ 1 & (\tfrac12 < x \le 1), \end{cases}$$

$$f_{2k}(x) = g(x) \quad (0 \le x \le 1), \qquad f_{2k+1}(x) = g(1 - x) \quad (0 \le x \le 1).$$

Show that

$$\liminf_{n \to \infty} f_n(x) = 0 \qquad (0 \le x \le 1),$$

but

$$\int_0^1 f_n(x) \, dx = \tfrac12.$$

[Compare with (77).]

6. Let

$$f_n(x) = \begin{cases} \dfrac{1}{n} & (|x| \le n), \\ 0 & (|x| > n). \end{cases}$$

Then $f_n(x) \to 0$ uniformly on $R^1$, but

$$\int_{-\infty}^{\infty} f_n \, dx = 2 \qquad (n = 1, 2, 3, \dots).$$

(We write $\int_{-\infty}^{\infty}$ in place of $\int_{R^1}$.) Thus uniform convergence does not imply
dominated convergence in the sense of Theorem 11.32. However, on sets of finite
measure, uniformly convergent sequences of bounded functions do satisfy
Theorem 11.32.

7. Find a necessary and sufficient condition that $f \in \mathcal{R}(\alpha)$ on $[a, b]$. Hint: Consider
Example 11.6(b) and Theorem 11.33.

8. If $f \in \mathcal{R}$ on $[a, b]$ and if $F(x) = \int_a^x f(t) \, dt$, prove that $F'(x) = f(x)$ almost
everywhere on $[a, b]$.

9. Prove that the function $F$ given by (96) is continuous on $[a, b]$.

10. If $\mu(X) < +\infty$ and $f \in \mathcal{L}^2(\mu)$ on $X$, prove that $f \in \mathcal{L}(\mu)$ on $X$. If

$$\mu(X) = +\infty,$$

this is false. For instance, if

$$f(x) = \frac{1}{1 + |x|},$$

then $f \in \mathcal{L}^2$ on $R^1$, but $f \notin \mathcal{L}$ on $R^1$.

11. If $f, g \in \mathcal{L}(\mu)$ on $X$, define the distance between $f$ and $g$ by

$$\int_X |f - g| \, d\mu.$$

Prove that $\mathcal{L}(\mu)$ is a complete metric space.

12. Suppose

(a) $|f(x, y)| \le 1$ if $0 \le x \le 1$, $0 \le y \le 1$,

(b) for fixed $x$, $f(x, y)$ is a continuous function of $y$,

(c) for fixed $y$, $f(x, y)$ is a continuous function of $x$.

Put

$$g(x) = \int_0^1 f(x, y) \, dy \qquad (0 \le x \le 1).$$

Is $g$ continuous?

13. Consider the functions

$$f_n(x) = \sin nx \qquad (n = 1, 2, 3, \dots,\ -\pi \le x \le \pi)$$

as points of $\mathcal{L}^2$. Prove that the set of these points is closed and bounded, but
not compact.

14. Prove that a complex function $f$ is measurable if and only if $f^{-1}(V)$ is measurable
for every open set $V$ in the plane.

15. Let $\mathcal{R}$ be the ring of all elementary subsets of $(0, 1]$. If $0 < a \le b \le 1$, define

$$\phi([a, b]) = \phi([a, b)) = \phi((a, b]) = \phi((a, b)) = b - a,$$

but define

$$\phi((0, b)) = \phi((0, b]) = 1 + b$$

if $0 < b \le 1$. Show that this gives an additive set function $\phi$ on $\mathcal{R}$, which is not
regular and which cannot be extended to a countably additive set function on a
$\sigma$-ring.

16. Suppose $\{n_k\}$ is an increasing sequence of positive integers and $E$ is the set of all
$x \in (-\pi, \pi)$ at which $\{\sin n_k x\}$ converges. Prove that $m(E) = 0$. Hint: For every
$A \subset E$,

$$\int_A \sin n_k x \, dx \to 0,$$

and

$$2 \int_A (\sin n_k x)^2 \, dx = \int_A (1 - \cos 2n_k x) \, dx \to m(A) \quad \text{as } k \to \infty.$$

17. Suppose $E \subset (-\pi, \pi)$, $m(E) > 0$, $\delta > 0$. Use the Bessel inequality to prove that
there are at most finitely many integers $n$ such that $\sin nx \ge \delta$ for all $x \in E$.

18. Suppose $f \in \mathcal{L}^2(\mu)$, $g \in \mathcal{L}^2(\mu)$. Prove that

$$\left| \int f \bar{g} \, d\mu \right|^2 = \int |f|^2 \, d\mu \int |g|^2 \, d\mu$$

if and only if there is a constant $c$ such that $g(x) = cf(x)$ almost everywhere.
(Compare Theorem 11.35.)


LIST OF SPECIAL SYMBOLS

The symbols listed below are followed by a brief statement of their meaning and by
the number of the page on which they are defined.

$\in$    belongs to    3
$\notin$    does not belong to    3
$\subset$, $\supset$    inclusion signs    3
$Q$    rational field    3
$<$, $\le$, $>$, $\ge$    inequality signs    3
sup    least upper bound    4
inf    greatest lower bound    4
$R$    real field    8
$+\infty$, $-\infty$, $\infty$    infinities    11, 27
$\bar{z}$    complex conjugate    14
Re $(z)$    real part    14
Im $(z)$    imaginary part    14
$|z|$    absolute value    14
$\sum$    summation sign    15, 59
$R^k$    euclidean $k$-space    16
$\mathbf{0}$    null vector    16
$\mathbf{x} \cdot \mathbf{y}$    inner product    16
$|\mathbf{x}|$    norm of vector $\mathbf{x}$    16
$\{x_n\}$    sequence    26
$\bigcup$, $\cup$    union    27
$\bigcap$, $\cap$    intersection    27
$(a, b)$    segment    31
$[a, b]$    interval    31
$E^c$    complement of $E$    32
$E'$    limit points of $E$    35
$\bar{E}$    closure of $E$    35
lim    limit    47
$\to$    converges to    47, 98
lim sup    upper limit    56
lim inf    lower limit    56
$g \circ f$    composition    86
$f(x+)$    right-hand limit    94
$f(x-)$    left-hand limit    94
$f'$, $\mathbf{f}'(x)$    derivatives    103, 112
$U(P, f)$, $U(P, f, \alpha)$, $L(P, f)$, $L(P, f, \alpha)$    Riemann sums    121, 122
$\mathcal{R}$, $\mathcal{R}(\alpha)$    classes of Riemann (Stieltjes) integrable functions    121, 122
$\mathcal{C}(X)$    space of continuous functions    150
$\|\ \|$    norm    140, 150, 326
exp    exponential function    179
$D_N$    Dirichlet kernel    189
$\Gamma(x)$    gamma function    192
$\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$    standard basis    205
$L(X)$, $L(X, Y)$    spaces of linear transformations    207
$[A]$    matrix    210
$D_j f$    partial derivative    215
$\nabla f$    gradient    217
$\mathcal{C}'$, $\mathcal{C}''$    classes of differentiable functions    219, 235
det $[A]$    determinant    232
$J_f(x)$    Jacobian    234
$\dfrac{\partial(y_1, \dots, y_n)}{\partial(x_1, \dots, x_n)}$    Jacobian    234
$I^k$    $k$-cell    245
$Q^k$    $k$-simplex    247
$dx_I$    basic $k$-form    257
$\wedge$    multiplication symbol    254
$d$    differentiation operator    260
$\omega_T$    transform of $\omega$    262
$\partial$    boundary operator    269
$\nabla \times \mathbf{F}$    curl    281
$\nabla \cdot \mathbf{F}$    divergence    281
$\mathcal{E}$    ring of elementary sets    303
$m$    Lebesgue measure    303, 308
$\mu$    measure    303, 308
$\mathfrak{M}_F$, $\mathfrak{M}$    families of measurable sets    305
$\{x \mid P\}$    set with property $P$    310
$f^+$, $f^-$    positive (negative) part of $f$    312
$K_E$    characteristic function    313
$\mathcal{L}$, $\mathcal{L}(\mu)$, $\mathcal{L}^2$, $\mathcal{L}^2(\mu)$    classes of Lebesgue-integrable functions    315, 326



INDEX 


Abel, N. H., 75, 174
Absolute convergence, 71
  of integral, 138
Absolute value, 14
Addition (see Sum)
Addition formula, 178
Additivity, 301
Affine chain, 268
Affine mapping, 266
Affine simplex, 266
Algebra, 161
  self-adjoint, 165
  uniformly closed, 161
Algebraic numbers, 43
Almost everywhere, 317
Alternating series, 71
Analytic function, 172
Anticommutative law, 256
Arc, 136
Area element, 283
Arithmetic means, 80, 199
Artin, E., 192, 195
Associative law, 5, 28, 259
Axioms, 5


Baire's theorem, 46, 82
Ball, 31
Base, 45
Basic form, 257
Basis, 205
Bellman, R., 198
Bessel inequality, 188, 328
Beta function, 193
Binomial series, 201
Bohr-Mollerup theorem, 193
Borel-measurable function, 313
Borel set, 309
Boundary, 269
Bounded convergence, 322
Bounded function, 89
Bounded sequence, 48
Bounded set, 32
Brouwer's theorem, 203
Buck, R. C., 195


Cantor, G., 21, 30, 186
Cantor set, 41, 81, 138, 168, 309
Cardinal number, 25
Cauchy criterion, 54, 59, 147
Cauchy sequence, 21, 52, 82, 329
Cauchy's condensation test, 61
Cell, 31
𝒞″-equivalence, 280
Chain, 268
    affine, 268
    differentiable, 270
Chain rule, 105, 214
Change of variables, 132, 252, 262
Characteristic function, 313
Circle of convergence, 69
Closed curve, 136
Closed form, 275
Closed set, 32
Closure, 35
    uniform, 151, 161
Collection, 27
Column matrix, 217
Column vector, 210
Common refinement, 123
Commutative law, 5, 28
Compact metric space, 36
Compact set, 36


Comparison test, 60
Complement, 32
Complete metric space, 54, 82, 151, 329
Complete orthonormal set, 331
Completion, 82
Complex field, 12, 184
Complex number, 12
Complex plane, 17
Component of a function, 87, 215
Composition, 86, 105, 127, 207
Condensation point, 45
Conjugate, 14
Connected set, 42
Constant function, 85
Continuity, 85
    uniform, 90
Continuous functions, space of, 150
Continuous mapping, 85
Continuously differentiable curve, 136
Continuously differentiable mapping, 219
Contraction, 220
Convergence, 47
    absolute, 71
    bounded, 322
    dominated, 321
    of integral, 138
    pointwise, 144
    radius of, 69, 79
    of sequences, 47
    of series, 59
    uniform, 147
Convex function, 101
Convex set, 31
Coordinate function, 88
Coordinates, 16, 205
Countable additivity, 301
Countable base, 45
Countable set, 25
Cover, 36
Cunningham, F., 167
Curl, 281
Curve, 136
    closed, 136
    continuously differentiable, 136
    rectifiable, 136
    space-filling, 168
Cut, 17


Davis, P. J., 192
Decimals, 11
Dedekind, R., 21
Dense subset, 9, 32
Dependent set, 205
Derivative, 104
    directional, 218
    of a form, 260
    of higher order, 110
    of an integral, 133, 236, 324
    integration of, 134, 324
    partial, 215
    of power series, 173
    total, 213
    of a transformation, 214
    of a vector-valued function, 112
Determinant, 232
    of an operator, 234
    product of, 233
Diagonal process, 30, 157
Diameter, 52
Differentiable function, 104, 212
Differential, 213
Differential equation, 119, 170
Differential form (see Form)
Differentiation (see Derivative)
Dimension, 205
Directional derivative, 218
Dirichlet's kernel, 189
Discontinuities, 94
Disjoint sets, 27
Distance, 30
Distributive law, 6, 20, 28
Divergence, 281
Divergence theorem, 253, 272, 288
Divergent sequence, 47
Divergent series, 59
Domain, 24
Dominated convergence theorem, 155, 167, 321
Double sequence, 144


e, 63
Eberlein, W. F., 184
Elementary set, 303
Empty set, 3
Equicontinuity, 156
Equivalence relation, 25
Euclidean space, 16, 30
Euler's constant, 197
Exact form, 275
Existence theorem, 170
Exponential function, 178
Extended real number system, 11
Extension, 99


Family, 27
Fatou's theorem, 320
Fejér's kernel, 199
Fejér's theorem, 199
Field axioms, 5
Fine, N. J., 100
Finite set, 25
Fixed point, 117
    theorems, 117, 203, 220
Fleming, W. H., 280
Flip, 249
Form, 254
    basic, 257
    of class 𝒞′, 𝒞″, 254
    closed, 275
    derivative of, 260
    exact, 275
    product of, 258, 260
    sum of, 256
Fourier, J. B., 186
Fourier coefficients, 186, 187
Fourier series, 186, 187, 328
Function, 24
    absolute value, 88
    analytic, 172
    Borel-measurable, 313
    bounded, 89
    characteristic, 313
    component of, 87
    constant, 85
    continuous, 85
        from left, 97
        from right, 97
    continuously differentiable, 219
    convex, 101
    decreasing, 95
    differentiable, 104, 212
    exponential, 178
    harmonic, 297
    increasing, 95
    inverse, 90
    Lebesgue-integrable, 315
    limit, 144
    linear, 206
    logarithmic, 180
    measurable, 310
    monotonic, 95
    nowhere differentiable continuous, 154
    one-to-one, 25
    orthogonal, 187
    periodic, 183
    product of, 85
    rational, 88
    Riemann-integrable, 121
    simple, 313
    sum of, 85
    summable, 315
    trigonometric, 182
    uniformly continuous, 90
    uniformly differentiable, 115
    vector-valued, 85
Fundamental theorem of calculus, 134, 324


Gamma function, 192
Geometric series, 61
Gradient, 217, 281
Graph, 99
Greatest lower bound, 4
Green's identities, 297
Green's theorem, 253, 255, 272, 282


Half-open interval, 31
Harmonic function, 297
Havin, V. P., 113
Heine-Borel theorem, 39
Helly's selection theorem, 167
Herstein, I. N., 65
Hewitt, E., 21
Higher-order derivative, 110
Hilbert space, 332
Hölder's inequality, 139


i, 13
Identity operator, 232
Image, 24
Imaginary part, 14
Implicit function theorem, 224
Improper integral, 139
Increasing index, 257
Increasing sequence, 55
Independent set, 205
Index of a curve, 201
Infimum, 4
Infinite series, 59
Infinite set, 25
Infinity, 11
Initial-value problem, 119, 170
Inner product, 16
Integrable functions, spaces of, 315, 326
Integral:
    countable additivity of, 316
    differentiation of, 133, 236, 324
    Lebesgue, 314
    lower, 121, 122
    Riemann, 121
    Stieltjes, 122
    upper, 121, 122
Integral test, 139
Integration:
    of derivative, 134, 324
    by parts, 134, 139, 141
Interior, 43
Interior point, 32
Intermediate value, 93, 100, 108
Intersection, 27
Interval, 31, 302
Into, 24
Inverse function, 90
Inverse function theorem, 221
Inverse image, 24
Inverse of linear operator, 207
Inverse mapping, 90
Invertible transformation, 207
Irrational number, 1, 10, 65
Isolated point, 32
Isometry, 82, 170
Isomorphism, 21


Jacobian, 234


Kellogg, O. D., 281
Kestelman, H., 167
Knopp, K., 21, 63


Landau, E. G. H., 21
Laplacian, 297
Least upper bound, 4
    property, 4, 18
Lebesgue, H. L., 186
Lebesgue-integrable function, 315
Lebesgue integral, 314
Lebesgue measure, 308
Lebesgue's theorem, 155, 167, 318, 321
Left-hand limit, 94
Leibnitz, G. W., 71
Length, 136
L'Hospital's rule, 109, 113
Limit, 47, 83, 144
    left-hand, 94
    lower, 56
    pointwise, 144
    right-hand, 94
    subsequential, 51
    upper, 56
Limit function, 144
Limit point, 32
Line, 17
Line integral, 255
Linear combination, 204
Linear function, 206
Linear mapping, 206
Linear operator, 207
Linear transformation, 206
Local maximum, 107
Localization theorem, 190
Locally one-to-one mapping, 223
Logarithm, 22, 180
Logarithmic function, 180
Lower bound, 3
Lower integral, 121, 122
Lower limit, 56


McShane, E. J., 313
Mapping, 24
    affine, 266
    continuous, 85
    continuously differentiable, 219
    linear, 206
    open, 100, 223
    primitive, 248
    uniformly continuous, 90
    (see also Function)
Matrix, 210
    product, 211
Maximum, 90
Mean square approximation, 187
Mean value theorem, 108, 235
Measurable function, 310
Measurable set, 305, 310
Measurable space, 310
Measure, 308
    outer, 304
Measure space, 310
Measure zero, set of, 309, 317
Mertens, F., 74
Metric space, 30
Minimum, 90
Möbius band, 298
Monotone convergence theorem, 318
Monotonic function, 95, 302
Monotonic sequence, 55
Multiplication (see Product)


Negative number, 7
Negative orientation, 267
Neighborhood, 32
Newton's method, 118
Nijenhuis, A., 223
Niven, I., 65, 198
Nonnegative number, 60
Norm, 16, 140, 150, 326
    of operator, 208
Normal derivative, 297
Normal space, 101
Normal vector, 284
Nowhere differentiable function, 154
Null space, 228
Null vector, 16
Number:
    algebraic, 43
    cardinal, 25
    complex, 12
    decimal, 11
    finite, 12
    irrational, 1, 10, 65
    negative, 7
    nonnegative, 60
    positive, 7, 8
    rational, 1
    real, 8


One-to-one correspondence, 25
Onto, 24
Open cover, 36
Open mapping, 100, 223
Open set, 32
Order, 3, 17
    lexicographic, 22
Ordered field, 7, 20
    k-tuple, 16
    pair, 12
    set, 3, 18, 22
Oriented simplex, 266
Origin, 16
Orthogonal set of functions, 187
Orthonormal set, 187, 327, 331
Outer measure, 304


Parameter domain, 254
Parameter interval, 136
Parseval's theorem, 191, 198, 328, 331
Partial derivative, 215
Partial sum, 59, 186
Partition, 120
    of unity, 251
Perfect set, 32
Periodic function, 183, 190
π, 183
Plane, 17
Poincaré's lemma, 275, 280
Pointwise bounded sequence, 155
Pointwise convergence, 144
Polynomial, 88
    trigonometric, 185
Positive orientation, 267
Power series, 69, 172
Primes, 197
Primitive mapping, 248
Product, 5
    Cauchy, 73
    of complex numbers, 12
    of determinants, 233
    of field elements, 5
    of forms, 258, 260
    of functions, 85
    inner, 16
    of matrices, 211
    of real numbers, 19, 20
    scalar, 16
    of series, 73
    of transformations, 207
Projection, 228
Proper subset, 3


Radius, 31, 32
    of convergence, 69, 79
Range, 24, 207
Rank, 228
Rank theorem, 229
Ratio test, 66
Rational function, 88
Rational number, 1
Real field, 8
Real line, 17
Real number, 8
Real part, 14
Rearrangement, 75
Rectifiable curve, 136
Refinement, 123
Reflexive property, 25
Regular set function, 303
Relatively open set, 35
Remainder, 211, 244
Restriction, 99
Riemann, B., 76, 186
Riemann integral, 121
Riemann-Stieltjes integral, 122
Riesz-Fischer theorem, 330
Right-hand limit, 94
Ring, 301
Robison, G. B., 184
Root, 10
Root test, 65
Row matrix, 217


Saddle point, 240
Scalar product, 16
Schoenberg, I. J., 168
Schwarz inequality, 15, 139, 326
Segment, 31
Self-adjoint algebra, 165
Separable space, 45
Separated sets, 42
Separation of points, 162
Sequence, 26
    bounded, 48
    Cauchy, 52, 82, 329
    convergent, 47
    divergent, 47
    double, 144
    of functions, 143
    increasing, 55
    monotonic, 55
    pointwise bounded, 155
    pointwise convergent, 144
    uniformly bounded, 155
    uniformly convergent, 157
Series, 59
    absolutely convergent, 71
    alternating, 71
    convergent, 59
    divergent, 59
    geometric, 61
    nonabsolutely convergent, 72
    power, 69, 172
    product of, 73
    trigonometric, 186
    uniformly convergent, 157
Set, 3
    at most countable, 25
    Borel, 309
    bounded, 32
    bounded above, 3
    Cantor, 41, 81, 138, 168, 309
    closed, 32
    compact, 36
    complete orthonormal, 331
    connected, 42
    convex, 31
    countable, 25
    dense, 9, 32
    elementary, 303
    empty, 3
    finite, 25
    independent, 205
    infinite, 25
    measurable, 305, 310
    nonempty, 3
    open, 32
    ordered, 3
    perfect, 32, 41
    relatively open, 35
    uncountable, 25, 30, 41
Set function, 301
σ-ring, 301

Simple discontinuity, 94
Simple function, 313
Simplex, 247
    affine, 266
    differentiable, 269
    oriented, 266
Singer, I. M., 280
Solid angle, 294
Space:
    compact metric, 36
    complete metric, 54
    connected, 42
    of continuous functions, 150
    euclidean, 16
    Hilbert, 332
    of integrable functions, 315, 326
    measurable, 310
    measure, 310
    metric, 30
    normal, 101
    separable, 45
Span, 204
Sphere, 272, 277, 294
Spivak, M., 272, 280
Square root, 2, 81, 118
Standard basis, 205
Standard presentation, 257
Standard simplex, 266
Stark, E. L., 199
Step function, 129
Stieltjes integral, 122
Stirling's formula, 194, 200
Stokes' theorem, 253, 272, 287
Stone-Weierstrass theorem, 162, 190, 246
Stromberg, K., 21
Subadditivity, 304
Subcover, 36
Subfield, 8, 13
Subsequence, 51
Subsequential limit, 51
Subset, 3
    dense, 9, 32
    proper, 3
Sum, 5
    of complex numbers, 12
    of field elements, 5
    of forms, 256
    of functions, 85
    of linear transformations, 207
    of oriented simplexes, 268
    of real numbers, 18
    of series, 59
    of vectors, 16
Summation by parts, 70
Support, 246
Supremum, 4
Supremum norm, 150
Surface, 254
Symmetric difference, 305


Tangent plane, 284
Tangent vector, 286
Tangential component, 286
Taylor polynomial, 244
Taylor's theorem, 110, 116, 176, 243
Thorpe, J. A., 280
Thurston, H. A., 21
Torus, 239-240, 285
Total derivative, 213
Transformation (see Function; Mapping)
Transitivity, 25
Triangle inequality, 14, 16, 30, 140
Trigonometric functions, 182
Trigonometric polynomial, 185
Trigonometric series, 186


Uncountable set, 25, 30, 41
Uniform boundedness, 155
Uniform closure, 151
Uniform continuity, 90
Uniform convergence, 147
Uniformly closed algebra, 161
Uniformly continuous mapping, 90
Union, 27
Uniqueness theorem, 119, 258
Unit cube, 247
Unit vector, 217
Upper bound, 3
Upper integral, 121, 122
Upper limit, 56


Value, 24
Variable of integration, 122
Vector, 16
Vector field, 281
Vector space, 16, 204
Vector-valued function, 85
    derivative of, 112
Volume, 255, 282


Weierstrass test, 148 
Weierstrass theorem, 40, 159 
Winding number, 201 


Zero set, 98, 117
Zeta function, 141 



